Preface


This book originated in part from lecture notes we developed while teaching courses in 
financial mathematics in the Master of Mathematical Finance Program at the University of 
Toronto during the years from 1998 to 2003. We were confronted with the challenge of 
teaching a varied set of finance topics, ranging from derivative pricing to risk management, 
while developing the necessary notions in probability theory, stochastic calculus, statistics, 
and numerical analysis and while having the students acquire practical computer laboratory 
experience in the implementation of financial models. The amount of material to be covered 
spans a daunting number of topics. The leading motives are recent discoveries in derivatives 
research, whose comprehension requires an array of applied mathematical techniques tradi- 
tionally taught in a variety of different graduate and senior undergraduate courses, often not 
included in the realm of traditional finance education. Our choice was to teach all the relevant 
topics in the context of financial engineering and mathematical finance while delegating more 
systematic treatments of the supporting disciplines, such as probability, statistics, numerical 
analysis, and financial markets and institutions, to parallel courses. Our project turned from 
a challenge into an interesting and rewarding teaching experience. We discovered that prob- 
ability and stochastic calculus, when presented in the context of derivative pricing, are easier 
to teach than we had anticipated. Most students find financial concepts and situations helpful 
to develop an intuition and understanding of the mathematics. A formal course in probability 
running in parallel introduced the students to the mathematical theory of stochastic calculus, 
but only after they already had acquired the basic problem-solving skills. Computer laboratory 
projects were run in parallel and took students through the actual “hands-on” implementation 
of the theory through a series of financial models. Practical notions of information technology 
were introduced in the laboratory as well as the basics in applied statistics and numerical 
analysis. 

This book is organized into two main parts: Part I consists of the main body of the theory 
and mathematical tools, and Part II covers a series of numerical implementation projects 
for laboratory instruction. The first part is organized into rather large chapters that span the 
main topics, which in turn consist of a series of related subtopics or sections. Chapter 1 
introduces the basic notions of pricing theory together with probability and stochastic calculus. 
The relevant notions in probability and stochastic calculus are introduced in the finance 
context. Students learn about static and dynamic hedging strategies and develop an underlying
framework for pricing various European-style contracts, including quanto and basket options. 
The martingale (or probabilistic) and partial differential equation (PDE) formulations are
presented as alternative approaches for derivatives pricing. The last part of Chapter 1 provides 
a theoretical framework for pricing American options. Chapter 2 is devoted to fixed-income 
derivatives. Numerical solution methods such as lattice models, model calibration, and Monte 
Carlo simulations are introduced within relevant projects in the second part of the book. 
Chapter 3 is devoted to more advanced mathematical topics in option pricing, covering some 
techniques for exact exotic option pricing within continuous-time state-dependent diffusion 
models. A substantial part of Chapter 3 is drawn from our recent research
and hence covers derivations of new pricing formulas for complex state-dependent diffusion 
models for European-style contracts as well as barrier options. One focus of this chapter is to 
expose the reader to some of the more advanced, yet essential, mathematical tools for tackling 
derivative pricing problems that lie beyond the standard contracts and/or simpler models. 
Although the technical content in Chapter 3 may be relatively high, our goal has been to 
present the material in a comprehensive fashion. Chapter 4 reviews numerical methods and 
statistical estimation methodologies for value-at-risk and risk management. 

Part II includes a dozen shorter “chapters,” each one dedicated to a numerical laboratory 
project. The additional files distributed in the attached disk give the documentation and 
framework as they were developed for the students. We made an effort to cover a broad 
variety of information technology topics, to make sure that the students acquire the basic 
programming skills required by a professional financial engineer, such as the ability to design 
an interface for a pricing module, produce scenario-generation engines for pricing and risk 
management, and access a host of numerical library components, such as linear algebra 
routines. In keeping with the general approach of this book, students acquire these skills not 
in isolation but, rather, in the context of concrete implementation tasks for pricing and risk 
management models. 

This book can be read and used in a variety of ways. In the mathematical
finance program, Chapters 1 and 2 and limited parts of Chapters 3 and 4 formed the core of
the theory course. All the chapters (i.e., projects) in Part II were used in the parallel numerical 
laboratory course. Some of the material in Chapter 3 can be used as a basis for a separate 
graduate course in advanced topics in pricing theory. Since Chapter 4, on value-at-risk, is
largely independent of the others, it may also be covered in a parallel risk
management course. 

The laboratory material has been organized in a series of modules for classroom instruction 
we refer to as projects (i.e., numerical laboratory projects). These projects serve to provide 
the student or practitioner with an initial experience in actual quantitative implementations 
of pricing and risk management. Admittedly, the initial projects are quite far from being 
realistic financial engineering problems, for they were devised mostly for pedagogical reasons 
to make students familiar with the most basic concepts and the programming environment. 
We thought that a key feature of this book was to keep the prerequisites to a bare minimum 
and not assume that all students have advanced programming skills. As the student proceeds 
further, the exercises become more challenging and resemble realistic situations more closely. 
The projects were designed to cover a reasonable spectrum of some of the basic topics 
introduced in Part I so as to enhance and augment the student’s knowledge in various basic 
topics. For example, students learn about static hedging strategies by studying problems 
with barrier options and variance swaps, learn how to design and calibrate lattice models 
and use them to price American and other exotics, learn how to back out a high-precision 
LIBOR zero-yield curve from swap and forward rates, learn how to set up and calibrate 
interest rate trees for pricing interest rate derivatives using a variety of one-factor short rate 
models, and learn about estimation and simulation methodologies for value-at-risk. As the
assignments progress, relevant programming topics may be introduced in parallel. Our choice 
fell on the Microsoft technologies because they provide perhaps the easiest-to-learn-about 
rapid application development frameworks; however, the concepts that students learn also 
have analogues with other technologies. Students learn gradually how to design the interface 
for a pricing model using spreadsheets. Most importantly, they learn how to invoke and use 
numerical libraries, including LAPACK, the standard numerical linear algebra package, as 
well as a broad variety of random- and quasi-random-number generators, zero finders and 
optimizer routines, spline interpolations, etc. To a large extent, technologies can be replaced. 
We have chosen Microsoft Excel as a graphic user interface as well as a programming tool. 
This should give most PC users the opportunity to quickly gain familiarity with the code 
and to modify and experiment with it as desired. The Math Point libraries for Visual Basic
(VB) and Visual Basic for Applications (VBA), which are used in our laboratory materials,
were developed specifically for this teaching project, but an experienced programmer could 
still use this book and work in alternative frameworks, such as the NAG FORTRAN libraries
under Linux and Java. The main motive of the book also applies in this case: We teach the 
relevant concepts in information technology, which are a necessary part of the professional 
toolkit of financial engineers, by following what according to our experience is the path of 
least resistance in the learning process. 

Finally, we would like to add numerous acknowledgments to all those who made this 
project a successful experience. Special thanks go to the students who attended the Master of 
Mathematical Finance Program at the University of Toronto in the years from 1998 to 2003. 
They are the ones who made this project come to life in the first place. We thank Oliver Chen 
and Stephan Lawi for having taught the laboratory course in the fifth year of the program. 
We thank Petter Wiberg, who agreed to make the material in his Ph.D. thesis available to 
us for partial use in Chapter 4. We thank our coauthors in the research papers we wrote 
over the years, including Peter Carr, Oliver Chen, Ken Jackson, Alexei Kusnetzov, Pierre 
Hauvillier, Stephan Lawi, Alex Lipton, Roman Makarov, Smaranda Paun, Dmitri Rubisov, 
Alexei Tchernitser, Petter Wiberg, and Andrei Zavidonov. 
CHAPTER 1


Pricing Theory 


Pricing theory for derivative securities is a highly technical topic in finance; its foundations 
rest on trading practices and its theory relies on advanced methods from stochastic calculus 
and numerical analysis. This chapter summarizes the main concepts while presenting the 
essential theory and basic mathematical tools with which the modeling and pricing of financial
derivatives can be achieved.

Financial assets are subdivided into several classes, some being quite basic while others are 
structured as complex contracts referring to more elementary assets. Examples of elementary 
asset classes include stocks, which are ownership rights to a corporate entity; bonds, which 
are promises by one party to make cash payments to another in the future; commodities, 
which are assets, such as wheat, metals, and oil, that can be consumed; and real estate assets,
which have a convenience yield deriving from their use. A more general example of an asset 
is that of a contractual contingent claim associated with the obligation of one party to enter 
a stream of more elementary financial transactions, such as cash payments or deliveries of 
shares, with another party at future dates. The value of an individual transaction is called a 
pay-off or payout. Mathematically, a pay-off can be modeled by means of a payoff function 
in terms of the prices of other, more elementary assets. 

There are numerous examples of contingent claims. Insurance policies, for instance, are 
structured as contracts that envision a payment by the insurer to the insured in case a specific 
event happens, such as a car accident or an illness, and whose pay-off is typically linked to the 
damage suffered by the insured party. Derivative assets are claims that distinguish themselves 
by the property that the payoff function is expressed in terms of the price of an underlying 
asset. In finance jargon, one often refers to underlying assets simply as underlyings. To 
some extent, there is an overlap between insurance policies and derivative assets, except the 
nomenclature differs because the first are marketed by insurance companies while the latter 
are traded by banks. 

A trading strategy consists of a set of rules indicating what positions to take in response 
to changing market conditions. For instance, a rule could say that one has to adjust the 
position in a given stock or bond on a daily basis to a level given by evaluating a certain 
function. The implementation of a trading strategy results in pay-offs that are typically 
random. A major difference that distinguishes derivative instruments from insurance contracts 
is that most traded derivatives are structured in such a way that it is possible to implement
trading strategies in the underlying assets that generate streams of pay-offs that replicate the 
pay-offs of the derivative claim. In this sense, trading strategies are substitutes for derivative 
claims. One of the driving forces behind derivatives markets is that some market participants, 
such as market makers, have a competitive advantage in implementing replication strategies, 
while their clients are interested in taking certain complex risk exposures synthetically by 
entering into a single contract. 

A key property of replicable derivatives is that the corresponding payoff functions depend 
only on prices of tradable assets, such as stocks and bonds, and are not affected by events,
such as car accidents or individual health conditions, that are not directly linked to an asset
price. In the latter case, risk can be reduced only by diversification and reinsurance. A related 
concept is that of portfolio immunization, which is defined as a trade intended to offset the 
risk of a portfolio over at least a short time horizon. A perfect replication strategy for a given 
claim is one for which a position in the strategy combined with an offsetting position in the 
claim are perfectly immunized, i.e., risk free. The position in an asset that immunizes a given 
portfolio against a certain risk is traditionally called the hedge ratio.¹ An immunizing trade is
called a hedge. One distinguishes between static and dynamic hedging, depending on whether 
the hedge trades can be executed only once or instead are carried over time while making 
adjustments to respond to new information. 

The assets traded to execute a replication strategy are called hedging instruments. A set of 
hedging instruments in a financial model is complete if all derivative assets can be replicated 
by means of a trading strategy involving only positions in that set. In the following, we shall 
define the mathematical notion of financial models by listing a set of hedging instruments 
and assuming that there are no redundancies, in the sense that no hedging instrument can 
be replicated by means of a strategy in the other ones. Another very common expression 
is that of risk factor: The risk factors underlying a given financial model with a complete 
basis of hedging instruments are given by the prices of the hedging instruments themselves 
or functions thereof; as these prices change, risk factor values also change and the prices of 
all other derivative assets change accordingly. The statistical analysis of risk factors allows 
one to assess the risk of financial holdings. 

Transaction costs are impediments to the execution of replication strategies and correspond 
to costs associated with adjusting a position in the hedging instruments. The market for a 
given asset is perfectly liquid if unlimited amounts of the asset can be traded without affecting 
the asset price. An important notion in finance is that of arbitrage: If an asset is replicable by 
a trading strategy and if the price of the asset is different from that of the replicating strategy, 
the opportunity for riskless gains/profits arises. Practical limitations to the size of possible 
gains are, however, placed by the inaccuracy of replication strategies due to either market 
incompleteness or lack of liquidity. In such situations, either riskless replication strategies are 
not possible or prices move in response to posting large trades. For these reasons, arbitrage 
opportunities are typically short lived in real markets. 

Most financial models in pricing theory account for finite liquidity indirectly, by postu- 
lating that prices are arbitrage free. Also, market incompleteness is accounted for indirectly 
and is reflected in corrections to the probability distributions in the price processes. In this 
stylized mathematical framework, each asset has a unique price.²


¹ Notice that the term hedge ratio is part of the finance jargon. As we shall see, in certain situations hedge ratios
are computed as mathematical ratios or limits thereof, such as derivatives. In other cases, expressions are more 
complicated. 

² To avoid the perception of a linguistic ambiguity, when in the following we state that a given asset is worth a
certain amount, we mean that amount is the asset price. 
Most financial models are built upon the perfect-markets hypothesis, according to which:


• There are no trading impediments such as transaction costs.
• The set of basic hedging instruments is complete.
• Liquidity is infinite.
• No arbitrage opportunities are present.


These hypotheses are robust in several ways. If liquidity is not perfect, then arbitrage oppor- 
tunities are short lived because of the actions of arbitrageurs. The lack of completeness and 
the presence of transaction costs impact prices in a way that is uniform across classes of
derivative assets and can safely be accounted for implicitly by adjusting the process proba- 
bilities. 

The existence of replication strategies, combined with the perfect-markets hypothesis, 
makes it possible to apply more sophisticated pricing methodologies to financial derivatives 
than is generally possible to devise for insurance claims and more basic assets, such as stocks. 
The key to finding derivative prices is to construct mathematical models for the underlying 
asset price processes and the replication strategies. Other sources of information, such as a 
country’s domestic product or a takeover announcement, although possibly relevant to the 
underlying prices, affect derivative prices only indirectly. 

This first chapter introduces the reader to the mathematical framework of pricing theory 
in parallel with the relevant notions of probability, stochastic calculus, and stochastic control 
theory. The dynamic evolution of the risk factors underlying derivative prices is random, i.e., 
not deterministic, and is subject to uncertainty. Mathematically, one uses stochastic processes, 
defined as random variables with probability distributions on sets of paths. Replicating and 
hedging strategies are formulated as sets of rules to be followed in response to changing price 
levels. The key principle of pricing theory is that if a given payoff stream can be replicated 
by means of a dynamic trading strategy, then the cost of executing the strategy must equal 
the price of a contractual claim to the payoff stream itself. Otherwise, arbitrage opportunities 
would ensue. Hence pricing can be reduced to a mathematical optimization problem: to 
replicate a certain payoff function while minimizing at the same time replication costs and 
replication risks. In perfect markets one can show that one can achieve perfect replication at 
a finite cost, while if there are imperfections one will have to find the right trade-off between 
risk and cost. The fundamental theorem of asset pricing is a far-reaching mathematical result 
that states:


• The solution of this optimization problem can be expressed in terms of a discounted
  expectation of future pay-offs under a pricing (or probability) measure.
• This representation is unique (with respect to a given discounting) as long as markets
  are complete.


Discounting can be achieved in various ways: using a bond, using the money market account, 
or in general using a reference numeraire asset whose price is positive. This is because pricing 
assets is a relative, as opposed to an absolute, concept: One values an asset by computing its 
worth as compared to that of another asset. A key point is that expectations used in pricing 
theory are computed under a probability measure tailored to the numeraire asset. 

In this chapter, we start the discussion with a simple single-period model, where trades 
can be carried out only at one point in time and gains or losses are observed at a later 
time, a fixed date in the future. In this context, we discuss static hedging strategies. We then 
briefly review some of the relevant and most basic elements of probability theory in the
context of multivariate continuous random variables. Brownian motion and martingales are
then discussed as an introduction to stochastic processes. We then move on to further discuss 
continuous-time stochastic processes and review the basic framework of stochastic (Itô) 
calculus. Geometric Brownian motion is then presented, with some preliminary derivations 
of Black-Scholes formulas for single-asset and multiasset price models. We then proceed 
to introduce a more general mathematical framework for dynamic hedging and derive the 
fundamental theorem of asset pricing (FTAP) for continuous-state-space and continuous- 
time-diffusion processes. We then apply the FTAP to European-style options. Namely, by the 
use of change of numeraire and stochastic calculus techniques, we show how exact pricing 
formulas based on geometric Brownian motions for the underlying assets are obtained for a 
variety of situations, ranging from elementary stock options to foreign exchange and quanto 
options. The partial differential equation approach for option pricing is then presented. We 
then discuss pricing theory for early-exercise or American-style options. 


1.1 Single-Period Finite Financial Models 


The simplest framework in pricing theory is given by single-period financial models, in which 
calendar time t is restricted to take only two values, current time t = 0 and a future date
t = T > 0. Such models are appropriate for analyzing situations where trades can be made
only at current time t = 0. Revenues (i.e., profits or losses) can be realized only at the later 
date T, while trades at intermediate times are not allowed. 

In this section, we focus on the particular case in which only a finite number of scenarios
$\omega_1, \ldots, \omega_m$ can occur. Scenario is a common term for an outcome or event. The scenario set
$\Omega = \{\omega_1, \ldots, \omega_m\}$ is also called the probability space. A probability measure $P$ is given by
a set of numbers $p_i$, $i = 1, \ldots, m$, in the interval $[0, 1]$ that sum up to 1; i.e.,

$$\sum_{i=1}^{m} p_i = 1, \qquad 0 \le p_i \le 1. \tag{1.1}$$


$p_i$ is the probability that scenario (event) $\omega_i$ occurs, i.e., that the $i$th state is attained. Scenario
$\omega_i$ is possible if it can occur with strictly positive probability $p_i > 0$. Neglecting scenarios that
cannot possibly occur, the probabilities $p_i$ will henceforth be assumed to be strictly positive;
i.e., $p_i > 0$. A random variable is a function on the scenario set, $f: \Omega \to \mathbb{R}$, whose values
$f(\omega_i)$ represent observables. As we discuss later in more detail, examples of random variables
one encounters in finance include the price of an asset or an interest rate at some point in
the future or the pay-off of a derivative contract. The expectation of the random variable $f$ is
defined as the sum

$$E^P[f] = \sum_{i=1}^{m} p_i f(\omega_i). \tag{1.2}$$

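As a minimal numerical sketch of the expectation in equation (1.2), the following Python fragment evaluates the weighted sum over a finite scenario set. The three scenarios, their probabilities, and the payoff values are hypothetical numbers chosen only for illustration; they do not come from the text.

```python
# Hypothetical three-scenario example; all numbers are illustrative assumptions.
probabilities = [0.2, 0.5, 0.3]   # p_i: strictly positive and summing to 1
payoffs = [90.0, 100.0, 120.0]    # f(omega_i): value of the random variable per scenario


def expectation(p, f):
    """Return E^P[f] = sum_i p_i * f(omega_i) for a finite scenario set."""
    if len(p) != len(f):
        raise ValueError("need one probability per scenario")
    if any(p_i <= 0.0 for p_i in p):
        raise ValueError("probabilities are assumed strictly positive")
    if abs(sum(p) - 1.0) > 1e-12:
        raise ValueError("probabilities must sum to 1")
    return sum(p_i * f_i for p_i, f_i in zip(p, f))


expected_value = expectation(probabilities, payoffs)
print(expected_value)
```

The validation step mirrors the conditions on the probabilities stated above: each one is strictly positive and together they sum to 1.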
Asset prices and other financial observables, such as interest rates, are modeled by
stochastic processes. In a single-period model, a stochastic process is given by a value $f_0$
at current time $t = 0$ and by a random variable $f_T$ that models possible values at time $T$. In
finance, probabilities are obtained with two basically different procedures: They can either
be inferred from historical data by estimating a statistical model, or they can be implied from
current asset valuations by calibrating a pricing model. The former are called historical,
statistical, or, better, real-world probabilities. The latter are called implied probabilities.
The calibration procedure involves using the fundamental theorem of asset pricing to represent
prices as discounted expectations of future pay-offs and represents one of the central topics
to be discussed in the rest of this chapter.
Definition 1.1. Financial Model A finite, single-period financial model $\mathcal{M} = (\Omega, \mathcal{A})$ is given
by a finite scenario set $\Omega = \{\omega_1, \ldots, \omega_m\}$ and $n$ basic asset price processes for hedging
instruments:

$$\mathcal{A} = \{A_t^1, \ldots, A_t^n : t = 0, T\}. \tag{1.3}$$

Here, $A_0^i$ models the current price of the $i$th asset at current (or initial) time $t = 0$ and $A_T^i$
is a random variable such that the price at time $T > 0$ of the $i$th asset in case scenario $\omega_j$
occurs is given by $A_T^i(\omega_j)$. The basic asset prices $A_t^i$, $i = 1, \ldots, n$, are assumed real and
positive.


Definition 1.2. Portfolio and Asset Let $\mathcal{M} = (\Omega, \mathcal{A})$ be a financial model. A portfolio $\pi$
is given by a vector with components $\pi_i \in \mathbb{R}$, $i = 1, \ldots, n$, representing the positions or
holdings in the family of basic assets with prices $A_t^1, \ldots, A_t^n$. The worth of the portfolio at
terminal time $T$ is given by $\sum_{i=1}^{n} \pi_i A_T^i(\omega)$ given the state or scenario $\omega$, whereas the current
price is $\sum_{i=1}^{n} \pi_i A_0^i$. A portfolio is nonnegative if it gives rise to nonnegative pay-offs under
all scenarios, i.e., $\sum_{i=1}^{n} \pi_i A_T^i(\omega_j) \ge 0$, $\forall j = 1, \ldots, m$. An asset price process $A_t = A_t(\omega)$
(a generic one, not necessarily that of a hedging instrument) is a process of the form

$$A_t = \sum_{i=1}^{n} \pi_i A_t^i \tag{1.4}$$

for some portfolio $\pi \in \mathbb{R}^n$.


The modeling assumption behind this definition is that market liquidity is infinite, meaning 
that asset prices don’t vary as a consequence of agents trading them. As we discussed at the 
start of this chapter, this hypothesis is valid only in case trades are relatively small, for large 
trades cause market prices to change. In addition, a financial model with infinite liquidity is 
mathematically consistent only if there are no arbitrage opportunities. 


Definition 1.3. Arbitrage: Single-Period Discrete Case An arbitrage opportunity or arbitrage
portfolio is a portfolio $\pi = (\pi_1, \ldots, \pi_n)$ such that either of the following conditions holds:

A1. The current price of $\pi$ is negative, $\sum_{i=1}^{n} \pi_i A_0^i < 0$, and the pay-off at terminal time $T$ is
nonnegative, i.e., $\sum_{i=1}^{n} \pi_i A_T^i(\omega_j) \ge 0$ for all $j$ states.

A2. The current price of $\pi$ is zero, i.e., $\sum_{i=1}^{n} \pi_i A_0^i = 0$, and the pay-off at terminal time $T$
in at least one scenario $\omega_j$ is positive, i.e., $\sum_{i=1}^{n} \pi_i A_T^i(\omega_j) > 0$ for some $j$th state, and the
pay-off at terminal time $T$ is nonnegative.


Definition 1.4. Market Completeness The financial model $\mathcal{M} = (\Omega, \mathcal{A})$ is complete if for
all random variables $f_T: \Omega \to \mathbb{R}$, where $f_T$ is a bounded payoff function, there exists an asset
price process or portfolio $A_t$ in the basic assets contained in $\mathcal{A}$ such that $A_T(\omega) = f_T(\omega)$ for
all scenarios $\omega \in \Omega$.


This definition essentially states that any pay-off (or state-contingent claim) can be repli- 
cated, i.e., is attainable by means of a portfolio consisting of positions in the set of basic 
assets. If an arbitrage portfolio exists, one says there is arbitrage. The first form of arbitrage 
occurs whenever there exists a trade of negative initial cost at time t = 0 by means of which 
one can form a portfolio that under all scenarios at future time t = T has a nonnegative 
pay-off. The second form of arbitrage occurs whenever one can perform a trade at zero cost 
at an initial time t = 0 and then be assured of a strictly positive payout at future time T under
at least one possible scenario, with no possible downside. In reality, in either case investors
would want to perform arbitrage trades and take arbitrarily large positions in the arbitrage 
portfolios. The existence of these trades, however, infringes on the modeling assumption of 
infinite liquidity, because market prices would shift as a consequence of these large trades 
having been placed. 
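The two forms of arbitrage can be tested mechanically for any given portfolio. The sketch below is an illustration under stated assumptions: the function names, the tolerance, and the two-asset price data are hypothetical, and the check applies to one candidate portfolio only. It does not search over all portfolios, so a negative result does not prove that a model is arbitrage free.

```python
def current_price(portfolio, prices_now):
    """Cost at t = 0: sum_i pi_i * A_0^i."""
    return sum(pi * a0 for pi, a0 in zip(portfolio, prices_now))


def terminal_payoffs(portfolio, prices_terminal):
    """Pay-off per scenario; prices_terminal[i][j] is asset i's price in scenario j."""
    n_scenarios = len(prices_terminal[0])
    return [
        sum(pi * row[j] for pi, row in zip(portfolio, prices_terminal))
        for j in range(n_scenarios)
    ]


def is_arbitrage(portfolio, prices_now, prices_terminal, tol=1e-12):
    """Check conditions A1 and A2 of the arbitrage definition for one portfolio."""
    cost = current_price(portfolio, prices_now)
    payoffs = terminal_payoffs(portfolio, prices_terminal)
    nonnegative = all(x >= -tol for x in payoffs)
    a1 = cost < -tol and nonnegative                      # negative cost, no downside
    a2 = abs(cost) <= tol and nonnegative and any(x > tol for x in payoffs)
    return a1 or a2


# Hypothetical data: asset 0 is riskless (100 -> 105), asset 1 is risky.
prices_now = [100.0, 100.0]
mispriced = [[105.0, 105.0], [120.0, 106.0]]   # risky asset beats the riskless one in both states
consistent = [[105.0, 105.0], [120.0, 90.0]]   # risky asset can underperform

portfolio = [-1.0, 1.0]                        # short one riskless unit, long one risky unit
found_arbitrage = is_arbitrage(portfolio, prices_now, mispriced)
no_arbitrage_here = is_arbitrage(portfolio, prices_now, consistent)
print(found_arbitrage, no_arbitrage_here)
```

In the mispriced case the zero-cost portfolio pays a strictly positive amount in both scenarios, which is the second form of arbitrage described above.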

Let’s start by considering the simplest case of a single-period economy consisting of only 
two hedging instruments (i.e., $n = 2$ basic assets) with price processes $A_t^1 = B_t$ and $A_t^2 = S_t$.
The scenario set, or sample space, is assumed to consist of only two possible states of the
world: $\Omega = \{\omega_+, \omega_-\}$. $S_t$ is the price of a risky asset, which can be thought of as a stock
price. The riskless asset is a zero-coupon bond, defined as a process $B_t$ that is known to be
worth the so-called nominal amount $B_T = N$ at time $T$ while at time $t = 0$ has worth

$$B_0 = (1 + rT)^{-1} N. \tag{1.5}$$

Here $r > 0$ is called the interest rate. As is discussed in more detail in Chapter 2, interest
rates can be defined with a number of different compounding rules; the definition chosen here
for $r$ corresponds to selecting $T$ itself as the compounding interval, with simple (or discrete)
compounding assumed. At current time $t = 0$, the stock has known worth $S_0$. At a later
time $t = T$, two scenarios are possible for the stock. If the scenario $\omega_+$ occurs, then there
is an upward move and $S_T = S_T(\omega_+) = S_+$; if the scenario $\omega_-$ occurs, there is a downward
move and $S_T = S_T(\omega_-) = S_-$, where $S_+ > S_-$. Since the bond is riskless we have $B_T(\omega_+) =
B_T(\omega_-) = B_T$. Assume that the real-world probabilities that these events will occur are $p_+ =
p \in (0,1)$ and $p_- = 1 - p$, respectively.

Figure 1.1 illustrates this simple economy. In this situation, the hypothesis of arbitrage 
freedom demands that the following strict inequality be satisfied: 


$$\frac{S_-}{1 + rT} < S_0 < \frac{S_+}{1 + rT}. \tag{1.6}$$

In fact, if, for instance, one had $S_0 < \frac{S_-}{1 + rT}$, then one could make unbounded riskless profits by
initially borrowing an arbitrary amount of money and buying an arbitrary number of shares
in the stock at price $S_0$ at time $t = 0$, followed by selling the stock at time $t = T$ at a higher
return level than $r$. Inequality (1.6) is an example of a restriction resulting from the condition
of absence of arbitrage, which is defined in more detail later.
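The restriction in inequality (1.6), and the borrow-and-buy argument that follows it, can be illustrated with a short sketch. The function names and all numerical inputs below are hypothetical, and simple compounding over the single period is assumed, as in equation (1.5).

```python
def satisfies_no_arbitrage_bounds(s0, s_up, s_down, r, T):
    """True if the discounted down price < S_0 < the discounted up price, as in (1.6)."""
    discount = 1.0 / (1.0 + r * T)
    return s_down * discount < s0 < s_up * discount


def borrow_and_buy_profits(s0, s_up, s_down, r, T):
    """Profit per share, in each scenario, from borrowing S_0 and buying one share.

    At time T the loan is repaid with simple interest and the share is sold.
    """
    repayment = s0 * (1.0 + r * T)
    return s_up - repayment, s_down - repayment


# Hypothetical inputs: S_0 = 100, r = 5%, T = 1 year.
consistent = satisfies_no_arbitrage_bounds(100.0, 120.0, 90.0, 0.05, 1.0)
violating = satisfies_no_arbitrage_bounds(100.0, 120.0, 110.0, 0.05, 1.0)

# When S_0 lies below the discounted down price, both scenario profits are positive.
profits = borrow_and_buy_profits(100.0, 120.0, 110.0, 0.05, 1.0)
print(consistent, violating, profits)
```

Because the strategy costs nothing at time 0 and the profit per share is positive in both scenarios, scaling up the number of shares scales the riskless profit, which is the sense in which the profits are unbounded.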

A derivative asset, of worth $A_t$ at time $t$, is a claim whose pay-off is contingent on future
values of risky underlying assets. In this simple economy the underlying asset is the stock.
An example is a derivative that pays $f_+$ dollars if the stock is worth $S_+$, and $f_-$ otherwise, at
final time $T$: $A_T = A_T(\omega_+) = f_+$ if $S_T = S_+$ and $A_T = A_T(\omega_-) = f_-$ if $S_T = S_-$. Assuming one
can take fractional positions, this payout can be statically replicated by means of a portfolio


FIGURE 1.1 A single-period model with two possible future prices for an asset S: from $S_0$, the price moves to $S_+$ with probability $p_+$ or to $S_-$ with probability $p_-$.
consisting of $a$ shares of the stock and $b$ bonds such that the following replication conditions
under the two scenarios are satisfied:

$$a S_- + b N = f_-, \tag{1.7}$$
$$a S_+ + b N = f_+. \tag{1.8}$$


The solution to this system is 


$$a = \frac{f_+ - f_-}{S_+ - S_-}, \qquad b = \frac{f_- S_+ - f_+ S_-}{N (S_+ - S_-)}. \tag{1.9}$$
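As a quick check of the hedge ratios in equation (1.9), the sketch below solves the two replication conditions for a hypothetical pay-off and verifies that the resulting portfolio reproduces it in both scenarios. The function name and the numbers (a call-like pay-off with strike 100 on a stock that moves to 120 or 90, and a bond with nominal 100) are illustrative assumptions.

```python
def replicating_portfolio(f_up, f_down, s_up, s_down, nominal):
    """Return (a, b): shares of stock and number of bonds solving (1.7) and (1.8)."""
    if s_up <= s_down:
        raise ValueError("the model assumes S_+ > S_-")
    spread = s_up - s_down
    a = (f_up - f_down) / spread
    b = (f_down * s_up - f_up * s_down) / (nominal * spread)
    return a, b


# Hypothetical pay-off: 20 in the up state, 0 in the down state.
s_up, s_down, nominal = 120.0, 90.0, 100.0
f_up, f_down = 20.0, 0.0

a, b = replicating_portfolio(f_up, f_down, s_up, s_down, nominal)

# Pay-off of the portfolio at time T in each scenario.
payoff_up = a * s_up + b * nominal
payoff_down = a * s_down + b * nominal
print(a, b, payoff_up, payoff_down)
```

In this example the stock position is positive and the bond position is negative, so the pay-off is replicated by buying stock partly financed by borrowing.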


The price of the replicating portfolio, with pay-off identical to that of the derivative, must be 
the price of the derivative asset; otherwise there would be an arbitrage opportunity. That is, 
one could make unlimited riskless profits by buying (or selling) the derivative asset and, at 
the same time, taking a short (or long) position in the portfolio at time $t = 0$. At time $t = 0$,
the arbitrage-free price of the derivative asset, $A_0$, is then

$$A_0 = a S_0 + b (1 + rT)^{-1} N
= \left( \frac{S_0 - (1 + rT)^{-1} S_-}{S_+ - S_-} \right) f_+ + \left( \frac{(1 + rT)^{-1} S_+ - S_0}{S_+ - S_-} \right) f_-. \tag{1.10}$$
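The two expressions in equation (1.10) can be compared numerically. The following sketch computes the cost of the replicating portfolio and the closed-form expression side by side; the function name and all inputs are hypothetical values chosen for illustration, and the bond nominal defaults to 1 as an assumption.

```python
def derivative_price(f_up, f_down, s0, s_up, s_down, r, T, nominal=1.0):
    """Return (replication cost, closed form) for the time-0 price in (1.10)."""
    discount = 1.0 / (1.0 + r * T)          # simple compounding over [0, T]
    spread = s_up - s_down

    # Hedge ratios from (1.9).
    a = (f_up - f_down) / spread
    b = (f_down * s_up - f_up * s_down) / (nominal * spread)

    # Cost of a shares plus b bonds, each bond worth discount * nominal today.
    replication_cost = a * s0 + b * discount * nominal

    # Same price written as a weighted sum of the two pay-offs.
    closed_form = (
        (s0 - discount * s_down) / spread * f_up
        + (discount * s_up - s0) / spread * f_down
    )
    return replication_cost, closed_form


# Hypothetical inputs: S_0 = 100, S_+ = 120, S_- = 90, r = 5%, T = 1, pay-offs 20 and 0.
cost, formula = derivative_price(20.0, 0.0, 100.0, 120.0, 90.0, 0.05, 1.0, nominal=100.0)
print(cost, formula)
```

The nominal amount cancels between the bond holding and the bond price, so the result does not depend on the value chosen for it.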








Dimensional considerations are often useful to understand the structure of pricing formulas 
and detect errors. It is important to remember that prices at different moments in calendar 
time are not equivalent and that they are related by discount factors. The hedge ratios a and 
b in equation (1.9) are dimensionless because they are expressed in terms of ratios of prices 
at time T. In equation (1.10) the variables $f_\pm$ and $S_+ - S_-$ are measured in dollars at time T,
so their ratio is dimensionless. Both $S_0$ and the discounted prices $(1+rT)^{-1} S_\pm$ are measured
in dollars at time 0, as is also the derivative price $A_0$.

Rewriting this last equation as 


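$$
A_0 = (1+rT)^{-1} \left[ \left( \frac{(1+rT) S_0 - S_-}{S_+ - S_-} \right) f_+ + \left( \frac{S_+ - (1+rT) S_0}{S_+ - S_-} \right) f_- \right] \tag{1.11}
$$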








shows that price $A_0$ can be interpreted as the discounted expected pay-off. However, the
probability measure is not the real-world one (i.e., not the physical measure P) with probabilities
$p_\pm$ for up and down moves in the stock price. Rather, current price $A_0$ is the discounted
expectation of future prices $A_T$, in the following sense:

$$
A_0 = (1+rT)^{-1} E^Q[A_T] = (1+rT)^{-1} \left[ q_+ A_T(\omega_+) + q_- A_T(\omega_-) \right] \tag{1.12}
$$





under the measure Q with probabilities (strictly between 0 and 1)

$$
q_+ = \frac{(1+rT) S_0 - S_-}{S_+ - S_-}, \qquad q_- = \frac{S_+ - (1+rT) S_0}{S_+ - S_-}, \tag{1.13}
$$

and $q_+ + q_- = 1$. The measure Q is called the pricing measure. Pricing measures also have
other, more specific names. In the particular case at hand, since we are discounting with a
constant interest rate within the time interval [0, T], Q is commonly named the risk-neutral
or risk-adjusted probability measure, where $q_\pm$ are so-called risk-neutral (or risk-adjusted)
probabilities. Later we shall see that this measure is also the forward measure, where the
bond price $B_t$ is used as numeraire asset. In particular, by expressing all asset prices relative
to (i.e., in units of) the bond price, $A_t^i/B_t$, with $B_T = N$ regardless of the scenario and
$B_0/B_T = (1+rT)^{-1}$, we can recast the foregoing expectation as $A_0 = B_0 E^Q[A_T/B_T]$.
Hence Q corresponds to the forward measure. We can also use as numeraire a discretely 
compounded money-market account having value (1+ rt) (or (1+ rt)N). By expressing all 
asset prices relative to this quantity, it is trivially seen that the corresponding measure is the 
same as the forward measure in this simple model. As discussed later, the name risk-neutral 
measure shall, however, refer to the case in which the money-market account (to be defined 
more generally later in this chapter) is used as numeraire, and this measure generally differs 
from the forward measure for more complex financial models. 
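
As a numerical check of equations (1.9) to (1.13), the following sketch prices a call-style pay-off in the two-state model in two ways: by replication and as a discounted expectation under Q. The numerical inputs (a stock at 100 moving to 120 or 90, r = 5%, T = 1, N = 100, strike 100) and all function names are illustrative assumptions and are not taken from the text.

```python
def replicate(f_up, f_dn, s_up, s_dn, nominal):
    """Solve a*S_+ + b*N = f_+ and a*S_- + b*N = f_- for (a, b), as in (1.9)."""
    a = (f_up - f_dn) / (s_up - s_dn)
    b = (f_dn * s_up - f_up * s_dn) / (nominal * (s_up - s_dn))
    return a, b


def risk_neutral_probs(s0, s_up, s_dn, r, T):
    """Probabilities q_+ and q_- of equation (1.13)."""
    growth = 1.0 + r * T
    q_up = (growth * s0 - s_dn) / (s_up - s_dn)
    return q_up, 1.0 - q_up


# Illustrative inputs (assumptions, not values from the text).
S0, S_UP, S_DN = 100.0, 120.0, 90.0
R, T, N = 0.05, 1.0, 100.0
STRIKE = 100.0
f_up, f_dn = max(S_UP - STRIKE, 0.0), max(S_DN - STRIKE, 0.0)

# Price by replication, equation (1.10).
a, b = replicate(f_up, f_dn, S_UP, S_DN, N)
price_replication = a * S0 + b * N / (1.0 + R * T)

# Price as a discounted expectation under Q, equation (1.12).
q_up, q_dn = risk_neutral_probs(S0, S_UP, S_DN, R, T)
price_risk_neutral = (q_up * f_up + q_dn * f_dn) / (1.0 + R * T)

print(a, b, q_up, price_replication, price_risk_neutral)
```

With these inputs the two prices agree, and $q_+$ lies strictly between 0 and 1 because $S_- < (1+rT) S_0 < S_+$.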

Later in this chapter, when we cover pricing in continuous time, we will be more specific 
in defining the terminology needed for pricing under general choices of numeraire asset. We 
will also see that what we just unveiled in this particularly simple case is a general and 
far-reaching property: Arbitrage-free prices can be expressed as discounted expectations of 
future pay-offs. More generally, we will demonstrate that asset prices can be expressed in 
terms of expectations of relative asset price processes. A pricing measure is then a martingale 
measure, under which all relative asset price processes (i.e., relative to a given choice of 
numeraire asset) are so-called martingales. Since our primary focus is on continuous-time 
pricing models, as introduced later in this chapter, we shall begin to explicitly cover some 
of the essential elements of martingales in the context of stochastic calculus and continuous- 
time pricing. For a more complete and elaborate mathematical construction of the martingale 
framework in the case of discrete-time finite financial models, however, we refer the reader 
to other literature (for example, see [Pli97, MM03]). 

We now extend the pricing formula of equation (1.12) to the case of n assets and m 
possible scenarios. 


Definition 1.5. Pricing Measure A probability measure $Q = (q_1, \ldots, q_m)$, $0 < q_j < 1$, for
the scenario set $\Omega = \{\omega_1, \ldots, \omega_m\}$ is a pricing measure if asset prices can be expressed as
follows:

$$
A_0^i = a\, E^Q[A_T^i] = a \sum_{j=1}^{m} q_j A_T^i(\omega_j) \tag{1.14}
$$

for all i = 1, ..., n and some real number a > 0. The constant a is called the discount factor.


Theorem 1.1. Fundamental Theorem of Asset Pricing (Discrete, single-period case)
Assume that all scenarios in $\Omega$ are possible. Then the following statements hold true:

• There is no arbitrage if and only if there is a pricing measure for which all scenarios
are possible.
• The financial model is complete, with no arbitrage, if and only if the pricing measure
is unique.

Proof. First, we prove that if a pricing measure $Q = (q_1, \ldots, q_m)$ exists and prices $A_0^i = a\, E^Q[A_T^i]$
for all i = 1, ..., n, then there is no arbitrage. If $\sum_i \pi_i A_T^i(\omega_j) \ge 0$ for all $\omega_j \in \Omega$,
then from equation (1.14) we must have $\sum_i \pi_i A_0^i \ge 0$. If $\sum_i \pi_i A_0^i = 0$, then from equation
(1.14) we cannot satisfy the payoff conditions in (A2) of Definition 1.3. Hence there is no
arbitrage, for any choice of portfolio $\pi \in \mathbb{R}^n$.

On the other hand, assume that there is no arbitrage. The possible price-payoff (m + 1)- 
tuples 


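$$
P = \left\{ \left( -\sum_{i=1}^{n} \pi_i A_0^i,\; \sum_{i=1}^{n} \pi_i A_T^i(\omega_1),\; \ldots,\; \sum_{i=1}^{n} \pi_i A_T^i(\omega_m) \right) : \pi \in \mathbb{R}^n \right\} \tag{1.15}
$$

make up a plane in $\mathbb{R} \times \mathbb{R}^m$. Since there is no arbitrage, the plane P intersects the octant
$\mathbb{R}_+ \times \mathbb{R}_+^m$ made up of vectors of nonnegative coordinates only in the origin. Let N be the set
of all vectors $(\beta, \gamma_1, \ldots, \gamma_m)$ normal to the plane P and normalized so that $\beta \ge 0$. Vectors
in N satisfy the normality condition

$$
-\beta \left( \sum_{i} \pi_i A_0^i \right) + \sum_{j} \gamma_j \left( \sum_{i} \pi_i A_T^i(\omega_j) \right) = 0 \tag{1.16}
$$

for all portfolios $\pi$.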
Next we obtain two Lemmas to complete the proof. 


Lemma 1.1. Suppose the financial model on the scenario set $\Omega$ and with instruments
$(A^1, \ldots, A^n)$ is arbitrage free and let dim P denote the dimension of the linear space P. If the matrix
rank dim P < m, then one can define l = (m − dim P) price-payoff tuples $(-B_0^k, B_T^k(\omega))$, k =
1, ..., l, so that the extended financial model with basic assets $(A^1, \ldots, A^n, B^1, \ldots, B^l)$
and scenario set $\Omega$ is complete and arbitrage free.


Proof. The price-payoff tuples $(-B_0^k, B_T^k(\omega_1), \ldots, B_T^k(\omega_m))$ can be found iteratively. Suppose
that l = m − dim P > 0. Then the complement to the linear space P has dimension $l + 1 \ge 2$.
Let $X = (-X_0, X_T(\omega))$ and $Y = (-Y_0, Y_T(\omega))$ be two vectors orthogonal to each other and
orthogonal to P. Then there is an angle $\theta$ such that the vector $B^1 = \cos\theta\, X + \sin\theta\, Y$ has at
least one strictly positive coordinate and one strictly negative coordinate, i.e., $B^1 \notin \mathbb{R}_+ \times \mathbb{R}_+^m$.
Hence the financial model with instruments $(A^1, \ldots, A^n, B^1)$ is arbitrage free. Iterating the
argument, one can complete the market while retaining arbitrage freedom. □


Lemma 1.2. If markets are complete, the space N orthogonal to P is spanned by a vector
$(\beta, \gamma_1, \ldots, \gamma_m)$ lying in the main octant $\mathbb{R}_+ \times \mathbb{R}_+^m$ of vectors with strictly positive
coordinates.

Proof. In fact if $\beta = 0$, then P contains the line (x, 0, ..., 0) and all positive payouts would
be possible, even for an empty portfolio, which is absurd. It is also absurd that $\gamma_j = 0$ for any j.
In fact, in this case, since markets are complete, there is an instrument paying one dollar in
case the scenario $\omega_j$ occurs and zero otherwise, and since $\gamma_j = 0$, the price of this instrument
at time t = 0 is zero, which is absurd. □


If markets are not complete, one can still conclude that the set N contains a vector
$(\beta, \gamma_1, \ldots, \gamma_m)$ with strictly positive coordinates. In fact, thanks to Lemma 1.1, one can
complete the market while preserving arbitrage freedom by introducing auxiliary assets, and the normal
vector can be chosen to have positive coordinates. Hence, in all cases of $\pi_i$ values, according
to equation (1.16) we have

$$
A_0^i = a\, E^Q[A_T^i] = a \sum_{j=1}^{m} q_j A_T^i(\omega_j), \tag{1.17}
$$

where Q is the measure with probabilities

$$
q_j = \frac{\gamma_j}{\sum_{k=1}^{m} \gamma_k} \tag{1.18}
$$

and discount factor

$$
a = \beta^{-1} \sum_{j=1}^{m} \gamma_j. \tag{1.19}
$$

The first project of Part II of this book is a study on single-period arbitrage. We refer the
interested reader to that project for a more detailed and practical exposition of the foregoing 
theory. In particular, the project provides an explicit discussion of a numerical linear algebra 
implementation for detecting arbitrage in single-period, finite financial models. 
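
To make Theorem 1.1 concrete, here is a minimal sketch that looks for a pricing measure in a finite model. It covers only the complete case with as many assets as scenarios and an invertible payoff matrix, where the state prices $\psi_j = a q_j$ solve a square linear system. It is not the implementation from the project in Part II; the function names and the three-asset example are assumptions made for illustration.

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(v)] for row, v in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("payoff matrix is singular: market is not complete")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def pricing_measure(prices, payoffs):
    """Return (discount, q) with prices[i] = discount * sum_j q[j] * payoffs[i][j],
    or None when some state price is not strictly positive (arbitrage)."""
    state_prices = solve_linear(payoffs, prices)
    if min(state_prices) <= 0.0:
        return None
    discount = sum(state_prices)
    return discount, [psi / discount for psi in state_prices]


# Hypothetical model: a bond, a stock, and a call struck at 100, three scenarios.
payoffs = [[1.0, 1.0, 1.0], [120.0, 100.0, 80.0], [20.0, 0.0, 0.0]]
fair = pricing_measure([0.95, 96.0, 4.0], payoffs)
mispriced = pricing_measure([0.95, 96.0, 20.0], payoffs)
print(fair, mispriced)
```

In the first set of prices all state prices are positive, so a pricing measure exists. In the second the call is so expensive that one state price turns negative, which signals an arbitrage.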


Problems 


Problem 1. Consider the simple example in Figure 1.1 and assume the interest rate is r. 
Under what condition is there no arbitrage in the model? 


Problem 2. Compute $E^Q[S_T]$ within the single-period two-state model. Explain your result.


Problem 3. Let $p_i$ denote the current price $A_0^i$ of the ith security and denote by $D_{ij} = A_T^i(\omega_j)$
the matrix elements of the n x m dividend matrix with i = 1, ..., n, j = 1, ..., m. Using
equation (1.14) with $a = (1+rT)^{-1}$ show that the risk-neutral expected return on any security
$A^i$ is given by the risk-free interest rate

$$
E^Q\left[ \frac{A_T^i - A_0^i}{A_0^i} \right] = \sum_{j=1}^{m} q_j \left( \frac{D_{ij}}{p_i} - 1 \right) = rT, \tag{1.20}
$$

where $q_j$ are the risk-neutral probabilities.


Problem 4. State the explicit matrix condition for market completeness in the single-period 
two-state model with the two basic assets as the riskless bond and the stock. Under what 
condition is this market complete? 


Problem 5. Arrow-Debreu securities are claims with unit pay-offs in only one state of the
world. Assuming a single-period two-state economy, these claims are denoted by $E_\pm$ and
defined by

$$
E_+(\omega) = \begin{cases} 1, & \text{if } \omega = \omega_+, \\ 0, & \text{if } \omega = \omega_-, \end{cases}
\qquad
E_-(\omega) = \begin{cases} 0, & \text{if } \omega = \omega_+, \\ 1, & \text{if } \omega = \omega_-. \end{cases}
$$

(a) Find exact replicating portfolios $\pi_+ = (a_+, b_+)$ and $\pi_- = (a_-, b_-)$ for $E_+$ and $E_-$,
respectively. The coefficients a and b are positions in the stock and the riskless bond,
respectively.

(b) Letting $F_T$ represent an arbitrary pay-off, find the unique portfolio of Arrow-Debreu
securities that replicates $F_T$.





1.2 Continuous State Spaces 


This section, together with the next section, presents a review of basic elements of probability 
theory for random variables that can take on a continuum of values while emphasizing some 
of the financial interpretation of mathematical concepts. 

Modern probability theory is based on measure theory. Referring the reader to textbook 
literature for more detailed and exhaustive formal treatments, we will just simply recall here 
that measure theory deals with the definition of measurable sets D, probability measures $\mu$,
and integrable functions $f : D \to \mathbb{R}$ for which one can evaluate expectations as integrals

$$
E[f] = \int_D f(x)\, \mu(dx). \tag{1.21}
$$

In finance, one typically deals with situations where the measurable set $D \subset \mathbb{R}^d$, with integer
$d \ge 1$. Realizations of the vector variable $x \in D$ correspond to scenarios for the risk factors
or random variables in a financial model.

Future asset prices are real-valued functions of underlying risk factors f(x) defined for
$x \in D$ and hence themselves define random variables. Probability measures $\mu(dx)$ are often
defined as $\mu(dx) = p(x)dx$, where p(x) is a real-valued continuous probability distribution
function that is nonnegative and integrates to 1; i.e.,

$$
p(x) \ge 0, \qquad \int_D p(x)\, dx = 1. \tag{1.22}
$$

The expectation $E^P[f]$ of f under the probability measure with p as density is defined by the
d-dimensional integral

$$
E^P[f] = \int_D f(x) p(x)\, dx. \tag{1.23}
$$

The pair $(D, \mu(dx))$ is called a probability space.

In particular, this formalism can also allow for the case of a finite scenario set of vectors
$D = \{x^{(1)}, \ldots, x^{(N)}\}$, as was considered in the previous section. In this case the probability
distribution is a sum of Dirac delta functions,

$$
p(x) = \sum_{i=1}^{N} p_i\, \delta(x - x^{(i)}). \tag{1.24}
$$

As further discussed shortly, a delta function can be thought of as a singular function that
is positive, integrates to 1 over all space, and corresponds to the infinite limiting case
of a sequence of integrable functions with support only at the origin. Probabilistically,
a distribution, such as equation (1.24), which is a sum of delta functions, corresponds to
a situation where only the scenarios $x^{(1)}, \ldots, x^{(N)}$ can possibly occur, and they do with
probabilities $p_1, \ldots, p_N$. These probabilities must be positive and add up to 1; i.e.,

$$
\sum_{i=1}^{N} p_i = 1. \tag{1.25}
$$

In the case of a finite scenario set (i.e., a finite set of possible events with finite integer N), the
random variable f = f(x) is a function defined on the set of scenarios D, and its expectation
under the measure with p as density is given by the finite sum

$$
E^P[f] = \sum_{i=1}^{N} p_i f(x^{(i)}). \tag{1.26}
$$

For a countably infinite set of scenarios, the preceding expressions must be considered
in the limit $N \to \infty$. Hence in the case of a discrete set of scenarios (as opposed to a
continuum) the probability density function collapses into the usual probability mass function,
as occurs in standard probability theory of discrete-valued random variables.

The Dirac delta function is not an ordinary function in $\mathbb{R}^d$ but, rather, a so-called
distribution. Mathematically, a distribution is defined through its value when integrated against
a smooth function. One can regard $\delta(x - x')$, $x, x' \in \mathbb{R}^d$, as the limit of an infinitesimally
narrow d-dimensional normal distribution:

$$
\int_{\mathbb{R}^d} f(x)\, \delta(x - x')\, dx = \lim_{\sigma \to 0} \frac{1}{(2\pi\sigma^2)^{d/2}} \int_{\mathbb{R}^d} f(x) \exp\left( -\frac{|x - x'|^2}{2\sigma^2} \right) dx = f(x'). \tag{1.27}
$$
For example, in one dimension a representation of the delta function is

$$
\delta(x - x') = \lim_{\sigma \to 0} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - x')^2 / (2\sigma^2)}. \tag{1.28}
$$
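
The limiting behaviour in equations (1.27) and (1.28) can be checked numerically in one dimension. The sketch below integrates a test function against normal densities of decreasing width and compares the result with the value of the function at the centre. The test function (cosine), the centre x' = 1, the trapezoidal rule, and the function name are assumptions chosen for illustration.

```python
import math


def smeared_value(f, x0, sigma, half_width=8.0, steps=4000):
    """Trapezoidal estimate of the integral of f(x) times a normal density
    with mean x0 and standard deviation sigma, over x0 +/- half_width*sigma."""
    lo = x0 - half_width * sigma
    h = 2.0 * half_width * sigma / steps
    norm = sigma * math.sqrt(2.0 * math.pi)
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        density = math.exp(-0.5 * ((x - x0) / sigma) ** 2) / norm
        total += weight * f(x) * density * h
    return total


# The error against f(x') shrinks as the normal density narrows.
errors = [abs(smeared_value(math.cos, 1.0, s) - math.cos(1.0)) for s in (1.0, 0.1, 0.01)]
print(errors)
```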





Events are modeled as subsets $G \subset D$ for which one can compute the integral that gives
the expectation $E^P[1_G]$. The function $1_G(x)$ denotes the random variable equal to 1 for $x \in G$
and to zero otherwise; $1_G(x)$ is called the indicator function of the set G. This expectation is
interpreted as the probability P(G) that event $G \subset D$ will occur; i.e.,

$$
P(G) = E^P[1_G] = \int_{\mathbb{R}^d} 1_G(x) p(x)\, dx = \int_G p(x)\, dx. \tag{1.29}
$$

Examples of events are subsets such as

$$
G = \{x \in D : a < f(x) < b\}, \tag{1.30}
$$


with b > a and where f is some function. An important concept associated with events is
that of conditional expectation. Given a random variable f, the expectation of f conditional
on knowing that event G will occur is

$$
E^P[f \mid G] = \frac{E^P[f\, 1_G]}{P(G)}. \tag{1.31}
$$


Two probability measures $\hat\mu(dx) = \hat p(x)dx$ and $\mu(dx) = p(x)dx$ are said to be equivalent
(or absolutely continuous with respect to one another) if they share the same sets of null
probability; i.e., $\hat\mu \sim \mu$ if the probability condition $\hat P(G) > 0$ implies $P(G) > 0$ and conversely, where

$$
\hat P(G) = E^{\hat P}[1_G] = \int_{\mathbb{R}^d} 1_G(x) \hat p(x)\, dx = \int_G \hat p(x)\, dx, \tag{1.32}
$$


with $E^{\hat P}[\,\cdot\,]$ denoting the expectation with respect to the measure $\hat\mu$. When computing the
expectation of a real-valued random variable, say, of the general form of a function of a
random vector (such functions are further defined in the next section), $f = f(X) : \mathbb{R}^d \to \mathbb{R}$, it
is sometimes useful to switch from one choice of probability measure to another, equivalent
one. One can use the following change of measure (known as the Radon-Nikodym theorem)
for computing expectations:

$$
E^P[f] = \int f(x)\, \mu(dx) = \int f(x) \frac{d\mu}{d\hat\mu}(x)\, \hat\mu(dx) = E^{\hat P}\left[ f \frac{d\mu}{d\hat\mu} \right]. \tag{1.33}
$$

The nonnegative random variable denoted by $d\mu/d\hat\mu$ is called the Radon-Nikodym derivative of
$\mu$ with respect to $\hat\mu$ (or P w.r.t. $\hat P$); for measures with densities it is the ratio $p(x)/\hat p(x)$.
From this result it also follows that $d\hat\mu/d\mu = (d\mu/d\hat\mu)^{-1}$ and
$E^{\hat P}[d\mu/d\hat\mu] = 1$. As will be seen later in the chapter, a more general adaptation of this result
for computing certain types of conditional expectations involving martingales will turn out 
to form one of the basic tools for pricing financial derivatives using changes of numeraire. 
Another particular example of the use of this change-of-measure technique is in the Monte 
Carlo estimation of integrals by so-called importance-sampling methods, as described in 
Chapter 4. 
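
As a small illustration of equation (1.33) in the importance-sampling setting, the sketch below estimates the tail probability P(X > 3) for a standard normal X. It draws samples under a shifted measure and reweights them by the Radon-Nikodym derivative, which for densities is the ratio of the two normal pdfs. The choice of a unit-variance normal centred at the threshold as sampling measure, the sample size, the seed, and the function name are all assumptions made for this example; this is not the method of Chapter 4.

```python
import math
import random


def tail_probability_importance(threshold, n_samples, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) under P by sampling from
    N(threshold, 1) (the measure P-hat) and reweighting by dP/dP-hat."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        y = rng.gauss(threshold, 1.0)
        if y > threshold:
            # Ratio of the N(0, 1) density to the N(threshold, 1) density at y.
            total += math.exp(-threshold * y + 0.5 * threshold ** 2)
    return total / n_samples


estimate = tail_probability_importance(3.0, 20000)
exact = 0.5 * math.erfc(3.0 / math.sqrt(2.0))  # closed form for comparison
print(estimate, exact)
```

Sampling directly under P would place only about one draw in a thousand in the region of interest, whereas under the shifted measure roughly half of the draws contribute.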

Just as integrals are approximated with arbitrary accuracy by finite integral sums, continuous
probability distributions can be approximated by discrete ones. For instance, let $D \subset \mathbb{R}^d$
be a bounded domain and p(x) be a continuous probability density on D and let $\{G_1, \ldots, G_m\}$
be a partition of D made up of a family of nonintersecting events $G_i \subset D$ whose union covers
the entire state space D and that have the shape of hypercubes. Let $p_i$ be the probability of
event $G_i$ under the probability measure with density p(x). Then an approximation for p(x) is

$$
p(x) \approx \sum_{i=1}^{m} p_i\, \delta(x - x_i), \tag{1.34}
$$

where $x_i$ is the center of the hypercube corresponding to event $G_i$. Let $\delta$ be the volume of the
largest hypercube among the cubes in the partition $\{G_1, \ldots, G_m\}$ and let f(x) be a random
variable on D. In the limit $\delta \to 0$, as the partition becomes finer and finer, the number of
events $m(\delta)$ will diverge to $\infty$. In this limit, we find

$$
E^P[f] = \lim_{\delta \to 0} \sum_{i=1}^{m(\delta)} p_i f(x_i). \tag{1.35}
$$
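
The convergence in equation (1.35) is easy to observe in one dimension. The sketch below partitions the interval [0, 1] into equal cells, assigns each cell its exact probability, and evaluates the random variable at the cell centres. The density p(x) = 2x, the random variable f(x) = x^2, and the function names are assumptions chosen so that the exact expectation, 1/2, is known.

```python
def discretized_expectation(f, cell_probability, lower, upper, cells):
    """Approximate E[f] on [lower, upper] by sum_i p_i * f(x_i), where p_i is
    the probability of the i-th cell and x_i its centre, as in equation (1.35)."""
    width = (upper - lower) / cells
    total = 0.0
    for i in range(cells):
        left = lower + i * width
        right = left + width
        total += cell_probability(left, right) * f(0.5 * (left + right))
    return total


def cell_prob(left, right):
    """Probability of [left, right] under the assumed density p(x) = 2x."""
    return right ** 2 - left ** 2


def square(x):
    return x ** 2


# The exact value is the integral of x^2 * 2x over [0, 1], i.e. 1/2.
approximations = [discretized_expectation(square, cell_prob, 0.0, 1.0, m) for m in (4, 40, 400)]
print(approximations)
```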


By using sums as approximations to expectations, which are essentially multidimensional 
Riemann integrals, one can extend the theorem in the previous section to the case of continuous 
probability distributions. Consider a single-period financial model with current (i.e., initial) 
time t = 0 and time horizon t = T and with n basic assets whose current prices are $A_0^i$,
i = 1, ..., n. The prices of these basic assets at time T are indexed by a continuous state space
represented by the domain $\Omega \subset \mathbb{R}^d$, and the values of the basic assets are random variables
$A_T^i(x)$, with $x \in \Omega$. That is, the asset prices $A_T^i$ are random variables assumed to take on real
positive values, i.e., $A_T^i : \Omega \to \mathbb{R}_+$. Let's denote by p(x)dx the real-world probability measure
in $\Omega$ and assume that the measure of all open subsets of $\Omega$ is strictly positive. A portfolio is
modeled by a vector $\pi$ whose components denote positions or holdings $\pi_i$, i = 1, ..., n, in
the basic assets. The definition of arbitrage extends as follows.


Definition 1.6. Nonnegative Portfolio A portfolio is nonnegative if it gives rise to nonnegative
expected pay-offs under almost all events $G \subset \Omega$ of nonzero probability, i.e., such that

$$
E^P\left[ \sum_{i=1}^{n} \pi_i A_T^i(x) \,\Big|\, x \in G \right] \ge 0. \tag{1.36}
$$


Definition 1.7. Arbitrage: Single-Period Continuous Case The market admits arbitrage if
either of the following conditions holds:

A1. There is a nonnegative portfolio $\pi$ of negative initial price $\sum_{i=1}^{n} \pi_i A_0^i < 0$.

A2. There is a nonnegative portfolio of zero initial cost, $\sum_{i=1}^{n} \pi_i A_0^i = 0$, for which the
expected payoff is strictly positive, i.e., $E^P\left[ \sum_{i=1}^{n} \pi_i A_T^i \right] > 0$.


Definition 1.8. Pricing Measure: Single-Period Continuous Case³ A probability measure
Q of density q(x)dx on $\Omega$ is a pricing measure if all asset prices at current time t = 0 can
be expressed as follows:

$$
A_0^i = a\, E^Q[f_i] = a \int_{\Omega} f_i(x) q(x)\, dx \tag{1.37}
$$

for some real number a > 0. The constant a is called the discount factor. The functions
$f_i(x) = A_T^i(x)$ are payoff functions for a given state or scenario x.


³ Later we relate such pricing measures to the case of arbitrary choices of numeraire asset wherein the pricing
formula involves an expectation of asset prices relative to the chosen numeraire asset price. Changes in numeraire
correspond to changes in the probability measure.
Market completeness is defined in a manner similar to that in the single-period discrete
case of the previous section. From the foregoing definitions of arbitrage and pricing measure
we then have the following result, whose proof is left as an exercise.


Theorem 1.2. Fundamental Theorem of Asset Pricing (Continuous Single-Period Case)
Assume that all scenarios in $\Omega$ are possible. Then the following statements hold true.

• There is no arbitrage if and only if there is a pricing measure for which all scenarios
are possible.

• If the linear span of the set of basic instruments $A^i$, i = 1, ..., n, is complete and
there is no arbitrage, then there is a unique pricing measure Q consistent with the
prices $A_0^i$ of the reference assets at current time t = 0.


The single-period pricing formalism can also be extended to the case of a multiperiod 
discrete-time financial model, where trading is allowed to take place at a finite number of 
intermediate dates. This feature gives rise to dynamic trading strategies, with portfolios in 
the basic assets being rebalanced at discrete points in time. The foregoing definitions and 
notions of arbitrage and asset pricing must then be modified and extended substantially. 
Rather than present the theory for such discrete-time models, we shall instead introduce more 
important theoretical tools in the following sections that will allow us ultimately to consider 
continuous-time financial models. Multiperiod discrete-time (continuous-state-space) models 
can then be obtained, if desired, as special cases of the continuous models via a discretization 
of time. A further discretization of the state space leads to discrete-time multiperiod finite 
financial models. 


1.3 Multivariate Continuous Distributions: Basic Tools 


Marginal probability distributions arise, for instance, when one is computing expectations 
on some reduced subspace of random variables. Consider, for example, a set of continuous 
random variables that can be separated or grouped into two random vector spaces $X =
(X_1, \ldots, X_m)$ and $Y = (Y_1, \ldots, Y_{n-m})$ that can take on values $x = (x_1, \ldots, x_m) \in \mathbb{R}^m$ and
$y = (y_1, \ldots, y_{n-m}) \in \mathbb{R}^{n-m}$, respectively, with $1 \le m < n$, $n \ge 2$. The function p(x, y) is
the joint probability density or probability distribution function (pdf) in the product space
$\mathbb{R}^n = \mathbb{R}^m \times \mathbb{R}^{n-m}$. The integral

$$
p_y(y) = \int_{\mathbb{R}^m} p(x, y)\, dx \tag{1.38}
$$

defines a marginal density $p_y(y)$. This function describes a probability density in the subspace
of random vectors $Y \in \mathbb{R}^{n-m}$ and integrates to unity over $\mathbb{R}^{n-m}$. The conditional density
function, denoted by $p(x \mid Y = y) \equiv p(x \mid y)$ for the random vector X, is defined on the subspace
$\mathbb{R}^m$ (for a given vector value Y = y) as the ratio of the joint probability
density function and the marginal density function for the random vector Y evaluated at y:

$$
p(x \mid y) = \frac{p(x, y)}{p_y(y)}, \tag{1.39}
$$

assuming $p_y(y) \ne 0$. From the foregoing two relations it is simple to see that, for any given y,
the conditional density also integrates to unity over $x \in \mathbb{R}^m$.
Conditional distributions play an important role in finance and pricing theory. As we
will see later, derivative instruments can be priced by computing conditional expectations.
Assuming a conditional distribution, the conditional expectation of a continuous random
variable g = g(X, Y), given Y = y, is defined by

$$
E[g \mid Y = y] = \int_{\mathbb{R}^m} g(x, y)\, p(x \mid y)\, dx. \tag{1.40}
$$

Given any two continuous random variables X and Y, then E[X | Y = y] is a number while
E[X | Y] is itself a random variable as Y is random, i.e., has not been fixed. We then have the
following property that relates unconditional and conditional expectations:

$$
E[X] = E\big[E[X \mid Y]\big] = \int_{-\infty}^{\infty} E[X \mid Y = y]\, p_y(y)\, dy. \tag{1.41}
$$

This property is useful for computing expectations by conditioning. More generally, for a
random variable given by the function g = g(X, Y) we have the property

$$
\begin{aligned}
E[g] &= \int_{\mathbb{R}^{n-m}} \int_{\mathbb{R}^m} g(x, y)\, p(x, y)\, dx\, dy \\
     &= \int_{\mathbb{R}^{n-m}} \left[ \int_{\mathbb{R}^m} g(x, y)\, p(x \mid y)\, dx \right] p_y(y)\, dy \\
     &= \int_{\mathbb{R}^{n-m}} E[g \mid Y = y]\, p_y(y)\, dy = E\big[E[g \mid Y]\big].
\end{aligned} \tag{1.42}
$$


Functions of random variables, such as g(X, Y), are of course also random variables. In
general, the pdf of a random variable given by a mapping $f = f(X) : \mathbb{R}^n \to \mathbb{R}$ is the function
$p_f : \mathbb{R} \to \mathbb{R}$,

$$
p_f(\xi) = \lim_{\delta\xi \to 0} \frac{P\big(f(X) \in [\xi, \xi + \delta\xi)\big)}{\delta\xi}, \tag{1.43}
$$

defined on some open or closed interval between a and b. This interval may be finite or
infinite; some examples are $\xi \in [0, 1]$, $[0, \infty)$, and $(-\infty, \infty)$. The cumulative distribution
function (cdf) $C_f$ for the random variable f is defined as

$$
C_f(z) = \int_a^z p_f(\xi)\, d\xi \tag{1.44}
$$

and gives the probability $P(a \le f \le z)$, with $C_f(b) = 1$. Let us consider another independent
real-valued random variable $g \in (c, d)$, where (c, d) is generally any other interval. We recall
that any two random variables f and g are independent if the joint pdf (or cdf) of f and g is
given by the product of the respective marginal pdfs (or cdfs). The sum of two independent
random variables f and g is again a random variable h = f + g. The cumulative distribution
function, denoted by $C_h$, for the random variable h is given by the convolution integral

$$
\begin{aligned}
C_h(\zeta) &= \iint_{\xi + \eta \le \zeta} p_f(\xi)\, p_g(\eta)\, d\xi\, d\eta \\
           &= \int_a^b p_f(\xi)\, C_g(\zeta - \xi)\, d\xi = \int_c^d p_g(\eta)\, C_f(\zeta - \eta)\, d\eta,
\end{aligned} \tag{1.45}
$$
where $p_g$ and $C_g$ are the density and cumulative distribution functions, respectively, for the
random variable g. By differentiating the cumulative distribution function we find the density
function for the variable h:

$$
p_h(\zeta) = \int_a^b p_f(\xi)\, p_g(\zeta - \xi)\, d\xi = \int_c^d p_g(\eta)\, p_f(\zeta - \eta)\, d\eta. \tag{1.46}
$$


The preceding formulas are sometimes useful because they provide the cumulative (or density) 
functions for a sum of two independent random variables as convolution integrals of the 
separate density and cumulative functions. 
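
A quick numerical check of the density convolution (1.46): the sum of two independent uniform random variables on [0, 1] has the well-known triangular density on [0, 2]. The sketch below evaluates the convolution integral with a midpoint rule and compares it with that closed form. The choice of uniform densities, the quadrature rule, and the function names are assumptions made for illustration.

```python
def convolve_density(p_f, p_g, zeta, lower, upper, steps=2000):
    """Midpoint-rule estimate of p_h(zeta) = integral over [lower, upper] of
    p_f(xi) * p_g(zeta - xi) d(xi), as in equation (1.46)."""
    h = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        xi = lower + (k + 0.5) * h
        total += p_f(xi) * p_g(zeta - xi) * h
    return total


def uniform_density(x):
    """Density of a uniform random variable on [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def triangular_density(z):
    """Known density of the sum of two independent uniforms on [0, 1]."""
    return max(0.0, 1.0 - abs(z - 1.0))


points = [0.25, 0.5, 1.0, 1.5, 1.9]
numeric = [convolve_density(uniform_density, uniform_density, z, 0.0, 1.0) for z in points]
print(list(zip(points, numeric)))
```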

The definition for cumulative distribution functions extends into the multivariate case in
the obvious manner. Given a pdf $p : \mathbb{R}^n \to \mathbb{R}$ for $\mathbb{R}^n$-valued random vectors $X = (X_1, \ldots, X_n)$,
the corresponding cdf is the function $C_p : \mathbb{R}^n \to \mathbb{R}$ defined by the joint probability

$$
C_p(x) = P(X_1 \le x_1, \ldots, X_n \le x_n) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} p(x')\, dx'. \tag{1.47}
$$

We recall that any two random variables $X_i$ and $X_j$ ($i \ne j$) are independent if the joint
probability $P(X_i \le a, X_j \le b) = P(X_i \le a) P(X_j \le b)$ for all $a, b \in \mathbb{R}$, i.e., if the events $\{X_i \le a\}$
and $\{X_j \le b\}$ are independent. Hence, for two independent random variables the joint cdf
and joint pdf are equal to the product of the marginal cdfs and marginal pdfs, respectively:
$p(x_i, x_j) = p_i(x_i) p_j(x_j)$ and $C_p(x_i, x_j) = C_i(x_i) C_j(x_j)$.

Another useful formula for multivariate distributions is the relationship between probability
densities (within the same probability measure, say, $\mu(dx)$) expressed on different variable
spaces or coordinate variables. That is, if $p(x)$ and $\tilde p(\tilde x)$ represent probability densities on
n-dimensional real-valued vector spaces $x$ and $\tilde x$, respectively, and the two spaces are related
by a one-to-one continuously differentiable mapping $\tilde x = \tilde x(x)$, then

$$
p(x) = \tilde p(\tilde x) \left| \frac{\partial \tilde x}{\partial x} \right|, \tag{1.48}
$$

where $\partial \tilde x / \partial x$ is the Jacobian matrix of the invertible transformation $x \to \tilde x$. The notation |M|
refers to the determinant of a matrix M.

A probability distribution that plays a distinguished role is the n-dimensional Gaussian
(or normal) distribution, with mean (or average) vector $\mu = (\mu_1, \ldots, \mu_n)$, defined on $x \in \mathbb{R}^n$
as follows:

$$
p(x; \mu, C) = \frac{1}{\sqrt{(2\pi)^n |C|}} \exp\left( -\frac{1}{2} (x - \mu) \cdot C^{-1} \cdot (x - \mu) \right). \tag{1.49}
$$

The shorthand notation $x \sim N_n(\mu, C)$ is also used to denote the values of an n-dimensional
random vector with components $x_1, \ldots, x_n$ that are obtained by sampling with distribution
$p(x; \mu, C)$. $C = (C_{ij})$ is called the covariance matrix and enjoys the property of being positive
definite, i.e., is such that the inner product $(x, Cx) \equiv x \cdot (Cx) > 0$ for all nonzero real vectors x, and
symmetric, $C_{ij} = C_{ji}$. It follows that the cdf of the n-dimensional multivariate normal random vector is
defined by the n-dimensional Gaussian integral

$$
\Phi_n(x; \mu, C) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} p(x'; \mu, C)\, dx'. \tag{1.50}
$$

A particularly important special case of equation (1.50) for n = 1 is the univariate standard
normal cdf (i.e., $\Phi_1(x; 0, 1)$), defined by

$$
N(x) \equiv \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\, dy. \tag{1.51}
$$

The mean of a random vector X with given pdf p(x) is defined by the components

$$
\mu_i = E[X_i] = \int_{\mathbb{R}^n} x_i\, p(x)\, dx = \int_{-\infty}^{\infty} x_i\, p_i(x_i)\, dx_i, \tag{1.52}
$$

and the covariance matrix elements are defined by the expectations

$$
C_{ij} \equiv \mathrm{Cov}(X_i, X_j) = E[(X_i - \mu_i)(X_j - \mu_j)] = \int_{\mathbb{R}^n} (x_i - \mu_i)(x_j - \mu_j)\, p(x)\, dx, \tag{1.53}
$$

for all i, j = 1, ..., n. The standard deviation of the random variable $X_i$ is defined as the
square root of the variance:

$$
\sigma_i \equiv \sqrt{\mathrm{Var}(X_i)} = \sqrt{E[(X_i - \mu_i)^2]}, \tag{1.54}
$$

and the correlation between two random variables $X_i$ and $X_j$ is defined as follows:

$$
\rho_{ij} \equiv \mathrm{Corr}(X_i, X_j) = \frac{C_{ij}}{\sigma_i \sigma_j}. \tag{1.55}
$$

Since $\sqrt{C_{ii}} = \sigma_i$, the correlation matrix has a unit diagonal, i.e., $\rho_{ii} = 1$. As well, the correlations obey
the inequality $|\rho_{ij}| \le 1$ (see Problem 1 of this section). For random variables that may be
positively or negatively correlated (e.g., as is the case for different stock returns) it follows that

$$
-1 \le \rho_{ij} \le 1. \tag{1.56}
$$

In the particular case of a multivariate normal distribution with positive definite covariance
matrix as in equation (1.49), the strict inequalities $-1 < \rho_{ij} < 1$ hold for $i \ne j$.

The main property of normal distributions is that the convolution of two normal distributions
is also normal. A random variable that is a sum of independent normal random variables is, therefore,
also normally distributed (see Problem 2). Because of this property, multivariate normal
distributions can be regarded as affine transformations of standard normal distributions with
$\mu = 0_{n \times 1}$ and $C = I_{n \times n}$ (the identity matrix). Consider the vector $\xi = (\xi_1, \ldots, \xi_n)$ of
independent standard normal variables with zero mean and unit covariance, i.e., with probability
density

$$
p(\xi) = \prod_{i=1}^{n} \frac{e^{-\xi_i^2/2}}{\sqrt{2\pi}}. \tag{1.57}
$$

If $L = (L_{ij})$ is an n x n matrix, then the random vector $X = \mu + L\xi$ is normally
distributed with mean $\mu$ and covariance $C = LL^\dagger$, where $\dagger$ denotes the matrix transpose. Indeed, taking
expectations over the components gives

$$
E[X_i] = E\left[ \mu_i + \sum_{j=1}^{n} L_{ij} \xi_j \right] = \mu_i, \tag{1.58}
$$

and

$$
E[(X_i - \mu_i)(X_j - \mu_j)] = E\left[ \left( \sum_{k=1}^{n} L_{ik} \xi_k \right) \left( \sum_{l=1}^{n} L_{jl} \xi_l \right) \right] = \sum_{k=1}^{n} L_{ik} L_{jk} = C_{ij}. \tag{1.59}
$$

Here we have used $E[\xi_i] = 0$ and $E[\xi_i \xi_j] = \delta_{ij}$, where $\delta_{ij}$ is Kronecker's delta, with value 1
if i = j and zero otherwise.

Conversely, given a positive definite matrix C, one can show that there is a lower triangular
matrix $L = (L_{ij})$ with $L_{ij} = 0$ if j > i, such that $C = LL^\dagger$. The matrix L can be evaluated with
a procedure known as Cholesky factorization. As discussed later in the book, this algorithm is
at the basis of Monte Carlo methods for generating scenarios obeying a multivariate normal
distribution with a given covariance matrix.

A special case of a multivariate normal is the bivariate distribution defined for $x =
(x_1, x_2) \in \mathbb{R}^2$:

$$
p(x_1, x_2; \mu_1, \mu_2, \sigma_1, \sigma_2, \rho) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \frac{(x_1 - \mu_1)^2}{\sigma_1^2} + \frac{(x_2 - \mu_2)^2}{\sigma_2^2} - \frac{2\rho (x_1 - \mu_1)(x_2 - \mu_2)}{\sigma_1 \sigma_2} \right] \right\}.
$$


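The parameters $\mu_i$ and $\sigma_i > 0$ are the mean and the standard deviation of $X_i$, i = 1, 2,
respectively, and $\rho$ ($-1 < \rho < 1$) is the correlation between $X_1$ and $X_2$, i.e., $\rho = \rho_{12} =
C_{12}/(\sigma_1 \sigma_2)$. In this case the covariance matrix is

$$
C = \begin{pmatrix} \sigma_1^2 & \rho \sigma_1 \sigma_2 \\ \rho \sigma_1 \sigma_2 & \sigma_2^2 \end{pmatrix}, \tag{1.60}
$$

and the lower Cholesky factorization of C is given by

$$
L = \begin{pmatrix} \sigma_1 & 0 \\ \rho \sigma_2 & \sigma_2 \sqrt{1 - \rho^2} \end{pmatrix}. \tag{1.61}
$$

The correlation matrix is simply

$$
\boldsymbol{\rho} = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}, \tag{1.62}
$$

with Cholesky factorization $\boldsymbol{\rho} = \Lambda \Lambda^\dagger$,

$$
\Lambda = \begin{pmatrix} 1 & 0 \\ \rho & \sqrt{1 - \rho^2} \end{pmatrix}. \tag{1.63}
$$

The covariance matrix has inverse

$$
C^{-1} = \frac{1}{1 - \rho^2} \begin{pmatrix} 1/\sigma_1^2 & -\rho/(\sigma_1 \sigma_2) \\ -\rho/(\sigma_1 \sigma_2) & 1/\sigma_2^2 \end{pmatrix}. \tag{1.64}
$$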

Conditional and marginal densities of the bivariate distribution are readily obtained by inte- 
grating over one of the variables in the foregoing joint density (see Problem 3). 
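
The construction $X = \mu + L\xi$ with the lower Cholesky factor of the bivariate covariance matrix can be sketched in a few lines. The example below builds L for assumed parameters, checks that $LL^\dagger$ reproduces C, and draws correlated pairs whose sample correlation should be close to $\rho$. The parameter values, sample size, seed, and function names are illustrative assumptions; this is not the Monte Carlo code discussed later in the book.

```python
import math
import random


def bivariate_cholesky(sigma1, sigma2, rho):
    """Lower Cholesky factor L of the bivariate covariance matrix, C = L L^T."""
    return [[sigma1, 0.0],
            [rho * sigma2, sigma2 * math.sqrt(1.0 - rho ** 2)]]


def sample_bivariate_normal(mu1, mu2, sigma1, sigma2, rho, n_samples, seed=0):
    """Draw pairs X = mu + L xi with xi a vector of independent standard normals."""
    rng = random.Random(seed)
    L = bivariate_cholesky(sigma1, sigma2, rho)
    samples = []
    for _ in range(n_samples):
        xi1, xi2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        samples.append((mu1 + L[0][0] * xi1,
                        mu2 + L[1][0] * xi1 + L[1][1] * xi2))
    return samples


def sample_correlation(samples):
    """Pearson correlation of a list of (x1, x2) pairs."""
    n = len(samples)
    mean1 = sum(x for x, _ in samples) / n
    mean2 = sum(y for _, y in samples) / n
    cov = sum((x - mean1) * (y - mean2) for x, y in samples) / n
    var1 = sum((x - mean1) ** 2 for x, _ in samples) / n
    var2 = sum((y - mean2) ** 2 for _, y in samples) / n
    return cov / math.sqrt(var1 * var2)


pairs = sample_bivariate_normal(0.0, 1.0, 2.0, 0.5, 0.6, 50000)
estimated_rho = sample_correlation(pairs)
print(estimated_rho)
```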

For multivariate normal distributions one has the following general result, which we state 
without proof. 





Proposition. Consider the random vector $X \in \mathbb{R}^n$ with partition $X = (X_1, X_2)$, $X_1 \in \mathbb{R}^m$,
$X_2 \in \mathbb{R}^{n-m}$ with $1 \le m < n$, $n \ge 2$. Let $X \sim N_n(\mu, C)$ with mean $\mu = (\mu_1, \mu_2)$ and n x n
covariance

$$
C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}
$$

with nonzero determinant $|C_{22}| \ne 0$, where $C_{11}$ and $C_{22}$ are m x m and (n − m) x (n − m)
covariance matrices of $X_1$ and $X_2$, respectively, and $C_{12} = C_{21}^\dagger$ is the m x (n − m) cross-covariance
matrix of the two subspace vectors. The conditional distribution of $X_1$, given
$X_2 = x_2$, is the m-dimensional normal density with mean $\tilde\mu = \mu_1 + C_{12} C_{22}^{-1} (x_2 - \mu_2)$ and
covariance $\tilde C = C_{11} - C_{12} C_{22}^{-1} C_{21}$, i.e., $x_1 \sim N_m(\tilde\mu, \tilde C)$ conditional on $X_2 = x_2$.


A relatively simple proof of this result follows by application of known identities for 
partitioned matrices. This result is useful in manipulating multidimensional integrals involving 
normal distributions. 
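
For n = 2 and m = 1 the proposition reduces to scalar formulas, which can be checked against a direct numerical integration of the bivariate density along the slice $X_2 = x_2$. The sketch below does this with a midpoint rule. The parameter values, the integration range of twelve standard deviations, and the function names are assumptions for illustration only.

```python
import math


def bivariate_density(x1, x2, mu1, mu2, s1, s2, rho):
    """Bivariate normal density with means mu, standard deviations s, correlation rho."""
    z1, z2 = (x1 - mu1) / s1, (x2 - mu2) / s2
    quad = (z1 * z1 + z2 * z2 - 2.0 * rho * z1 * z2) / (1.0 - rho * rho)
    return math.exp(-0.5 * quad) / (2.0 * math.pi * s1 * s2 * math.sqrt(1.0 - rho * rho))


def conditional_moments_formula(x2, mu1, mu2, s1, s2, rho):
    """Conditional mean and variance of X1 given X2 = x2 from the proposition."""
    c11, c12, c22 = s1 * s1, rho * s1 * s2, s2 * s2
    return mu1 + c12 / c22 * (x2 - mu2), c11 - c12 * c12 / c22


def conditional_moments_numeric(x2, mu1, mu2, s1, s2, rho, steps=20000):
    """Same moments from p(x1 | x2) = p(x1, x2) / p_2(x2), integrated numerically."""
    lo, hi = mu1 - 12.0 * s1, mu1 + 12.0 * s1
    h = (hi - lo) / steps
    m0 = m1 = m2 = 0.0
    for k in range(steps):
        x1 = lo + (k + 0.5) * h
        w = bivariate_density(x1, x2, mu1, mu2, s1, s2, rho) * h
        m0 += w            # marginal density of X2 at x2
        m1 += w * x1
        m2 += w * x1 * x1
    mean = m1 / m0
    return mean, m2 / m0 - mean * mean


params = dict(mu1=1.0, mu2=-0.5, s1=2.0, s2=0.5, rho=0.7)
formula = conditional_moments_formula(0.3, **params)
numeric = conditional_moments_numeric(0.3, **params)
print(formula, numeric)
```

In this scalar case the conditional variance equals $\sigma_1^2 (1 - \rho^2)$ and does not depend on the observed value $x_2$.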

In deriving analytical properties associated with expectations or conditional expectations 
of random variables, the concept of a characteristic function is useful. Given a pdf $p : \mathbb{R}^n \to \mathbb{R}$
for a continuous random vector $\mathbf{X} = (X_1, \ldots, X_n)$, the (joint) characteristic function is the
function $\phi_{\mathbf{X}} : \mathbb{R}^n \to \mathbb{C}$ defined by

$$\phi_{\mathbf{X}}(\mathbf{u}) = E[e^{i\mathbf{u}\cdot\mathbf{X}}] = \int_{\mathbb{R}^n} e^{i\mathbf{u}\cdot\mathbf{x}}\, p(\mathbf{x})\, d\mathbf{x}, \qquad (1.65)$$

where $\mathbf{u} = (u_1, \ldots, u_n) \in \mathbb{R}^n$, $i = \sqrt{-1}$. Since $\phi_{\mathbf{X}}$ is the Fourier transform of $p$, we know from
the theory of Fourier integral transforms that the characteristic function gives a
complete characterization of the probabilistic law of $\mathbf{X}$, equivalent to that given by $p$. That is,
any two random variables having the same characteristic function are identically distributed;
i.e., the characteristic function uniquely determines the distribution. From the definition we
observe that $\phi_{\mathbf{X}}$ is always a well-defined continuous function, given that $p$ is a bona fide
distribution. Evaluating at the origin gives $\phi_{\mathbf{X}}(\mathbf{0}) = E[1] = 1$. The existence of the derivatives
$\partial^k \phi_{\mathbf{X}}(\mathbf{0})/\partial u_i^k$, $k \ge 1$, depends on the existence of the respective moments of the random
variables $X_i$. The kth moment of a single random variable $X \in \mathbb{R}$ is defined by


$$m_k = E[X^k] = \int_{-\infty}^{\infty} x^k p(x)\, dx, \qquad (1.66)$$

while the kth centered moment is defined by

$$\mu^{(k)} = E[(X-\mu)^k] = \int_{-\infty}^{\infty} (x-\mu)^k p(x)\, dx, \qquad (1.67)$$

$\mu = E[X]$, $k \ge 1$. [Note: for $X = X_i$ then $p \to p_i$ is the ith marginal pdf, $\mu \to \mu_i = E[X_i]$,
$\mu^{(k)} \to \mu_i^{(k)} = E[(X_i - \mu_i)^k]$, etc.] From these integrals we thus see that the existence of the
moments depends on the decay behavior of $p$ at the limits $x \to \pm\infty$. For instance, a distribution
that exhibits asymptotic decay at least as fast as a decaying exponential has finite moments 
to all orders. Obvious examples of these include the distributions of normal, exponential, 
and uniform random variables. In contrast, distributions that decay as some polynomial to a 
negative power possess at most a finite number of finite moments. A classic case is the
Student t distribution with integer d degrees of freedom, which can be shown to possess only
moments of order strictly less than d. This distribution is discussed in Chapter 4 with respect to modeling
risk-factor return distributions. 

The moments can be obtained from the derivatives of $\phi_{\mathbf{X}}$ at the origin. However, it is
a little more convenient to work directly with the moment-generating function (mgf). The
(joint) moment-generating function is given by

$$M_{\mathbf{X}}(\mathbf{u}) = E[e^{\mathbf{u}\cdot\mathbf{X}}] = \int_{\mathbb{R}^n} e^{\mathbf{u}\cdot\mathbf{x}}\, p(\mathbf{x})\, d\mathbf{x}. \qquad (1.68)$$

If the mgf exists (which is not always true), then it is related to the characteristic function:
$M_{\mathbf{X}}(\mathbf{u}) = \phi_{\mathbf{X}}(-i\mathbf{u})$. It can be shown that if $E[|X|^r] < \infty$, then $M_X(u)$ (and $\phi_X(u)$) has
continuous rth derivative at $u = 0$ with moments given by

$$m_k = E[X^k] = \frac{d^k M_X(0)}{du^k} = (-i)^k \frac{d^k \phi_X(0)}{du^k}, \quad k = 1, \ldots, r. \qquad (1.69)$$

Hence, a random variable $X$ has finite moments of all orders when $M_X(u)$ (or $\phi_X(u)$) is
continuously differentiable to any order with $m_k = M_X^{(k)}(0) = (-i)^k \phi_X^{(k)}(0)$, $k = 1, 2, \ldots$.

Given two independent random variables $X$ and $Y$, the characteristic function of the
sum $X + Y$ simplifies into a product of functions: $\phi_{X+Y}(u) = E[e^{iu(X+Y)}] = E[e^{iuX}]E[e^{iuY}] =
\phi_X(u)\phi_Y(u)$. Hence for $Z = \sum_{i=1}^n X_i$ we have $\phi_Z(u) = \prod_{i=1}^n \phi_{X_i}(u)$ if all $X_i$ are independent.
Characteristic functions or mgfs can be obtained in analytically closed form for various 
common distributions. 
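
The product rule for independent sums can be checked with sample estimates of the characteristic function. The sketch below is illustrative only: the helper `empirical_cf`, the choice of a uniform and an exponential variable, the sample size, and the evaluation point `u` are all assumptions of the example, and agreement holds only up to Monte Carlo error of order $1/\sqrt{n}$.

```python
import numpy as np

def empirical_cf(samples, u):
    """Sample estimate of the characteristic function E[exp(i u X)]."""
    return np.mean(np.exp(1j * u * samples))

rng = np.random.default_rng(0)
n = 200_000
x = rng.uniform(-1.0, 2.0, size=n)        # X ~ uniform on (-1, 2)
y = rng.exponential(scale=0.5, size=n)    # Y ~ exponential, drawn independently of X

u = 1.3
cf_sum = empirical_cf(x + y, u)                      # estimate of phi_{X+Y}(u)
cf_prod = empirical_cf(x, u) * empirical_cf(y, u)    # estimate of phi_X(u) * phi_Y(u)
err = abs(cf_sum - cf_prod)
print(cf_sum, cf_prod, err)
```

If `x` and `y` were made dependent (for example by setting `y = x**2`), the two estimates would in general no longer agree.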


Problems 
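Problem 1. Make use of equations (1.53) and (1.54) and the Schwarz inequality,

$$\left(\int f(\mathbf{x})\, g(\mathbf{x})\, d\mathbf{x}\right)^2 \le \left(\int (f(\mathbf{x}))^2\, d\mathbf{x}\right)\left(\int (g(\mathbf{x}))^2\, d\mathbf{x}\right), \quad \mathbf{x} \in \mathbb{R}^n, \qquad (1.70)$$

to demonstrate the inequality $|C_{ij}| \le \sigma_i \sigma_j$, hence $|\rho_{ij}| \le 1$.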


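Problem 2. Consider two independent normal random variables $X$ and $Y$ with probability
distributions

$$p_x(x) = \frac{1}{\sigma_x\sqrt{2\pi}}\, e^{-(x-\mu_x)^2/2\sigma_x^2} \quad \text{and} \quad p_y(y) = \frac{1}{\sigma_y\sqrt{2\pi}}\, e^{-(y-\mu_y)^2/2\sigma_y^2}, \qquad (1.71)$$

respectively. Use convolution to show that $Z = X + Y$ is a normal random variable with
probability distribution

$$p_z(z) = \frac{1}{\sigma_z\sqrt{2\pi}}\, e^{-(z-\mu_z)^2/2\sigma_z^2}, \qquad (1.72)$$

where $\sigma_z^2 = \sigma_x^2 + \sigma_y^2$ and $\mu_z = \mu_x + \mu_y$.
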
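Problem 3. Show that the joint density function for the bivariate normal has the form

$$p(x, y; \mu_1, \mu_2, \sigma_1, \sigma_2, \rho) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left[-\frac{1}{2(1-\rho^2)}\left(\frac{(x-\mu_1)^2}{\sigma_1^2} - \frac{2\rho(x-\mu_1)(y-\mu_2)}{\sigma_1\sigma_2} + \frac{(y-\mu_2)^2}{\sigma_2^2}\right)\right] \qquad (1.73)$$

and thereby obtain the marginal and conditional distributions:

$$p_y(y) = \frac{1}{\sigma_2\sqrt{2\pi}}\, e^{-(y-\mu_2)^2/2\sigma_2^2}, \qquad (1.74)$$

$$p(x|y) = \frac{1}{\sqrt{2\pi(1-\rho^2)}\,\sigma_1} \exp\left[-\frac{1}{2(1-\rho^2)\sigma_1^2}\left(x - \mu_1 - \rho\frac{\sigma_1}{\sigma_2}(y-\mu_2)\right)^2\right]. \qquad (1.75)$$

Verify that this same result follows as a special case of the foregoing proposition.


Problem 4. Find the moment-generating function for the following distributions:

(a) The uniform distribution on the interval $(a, b)$ with pdf: $p(x) = (b-a)^{-1}\mathbf{1}_{x\in[a,b]}$.
(b) The exponential distribution with parameter $\lambda > 0$ and pdf: $p(x) = \lambda e^{-\lambda x}\mathbf{1}_{x\ge 0}$.
(c) The gamma distribution with parameters $(n, \lambda)$, $n = 1, 2, \ldots$, $\lambda > 0$, and pdf: $p(x) = \dfrac{\lambda e^{-\lambda x}(\lambda x)^{n-1}}{(n-1)!}\mathbf{1}_{x\ge 0}$.

By differentiating the mgf, obtain the mean and variance of the random variable $X$ for each
distribution (a)-(c).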


Problem 5. Obtain the moment-generating function for: 
(a) The multivariate normal with density given by equation (1.49). 


(b) The chi-squared random variable with n degrees of freedom: $Y = \sum_{i=1}^n Z_i^2$, where
$Z_i \sim N(0, 1)$.


Problem 6. Rederive the result in Problem 2 using an argument based solely on moment-
generating functions.


Problem 7. Consider two independent exponential random variables $X_1$ and $X_2$ with respective
parameters $\lambda_1$ and $\lambda_2$, $\lambda_1 \ne \lambda_2$. Find the pdf for $X_1 + X_2$ and the probability $P(X_1 < X_2)$.
Hint: Use convolution and conditioning, respectively.

A particularly important example of a multivariate normal distribution is provided by a random
path evaluated at a sequence of dates in the future. Consider a time interval $[0, t]$ with partition
$t_0 = 0 < t_1 < \cdots < t_N = t$, and subdivide it into $N \ge 1$ subintervals $[t_i, t_{i+1}]$ of length $\delta t_i = t_{i+1} - t_i$,
$i = 0, \ldots, N-1$. The path points $(t, x_t)$ are defined for all $t = t_i$ by means of the recurrence
relation

$$x_{t_{i+1}} = x_{t_i} + \mu(t_i)\,\delta t_i + \sigma(t_i)\,\delta W_{t_i}, \qquad (1.76)$$

where the increments $\delta W_{t_i} = W_{t_{i+1}} - W_{t_i}$ are assumed uncorrelated (independent) normal
random variables with probability density at $\delta W_{t_i} = \delta w_i$:

$$p_i(\delta w_i) = \frac{1}{\sqrt{2\pi\,\delta t_i}}\, e^{-(\delta w_i)^2/2\delta t_i}. \qquad (1.77)$$

Since the increments are assumed independent, the joint pdf for all increments is

$$p(\delta w_0, \ldots, \delta w_{N-1}) = \prod_{i=0}^{N-1} p_i(\delta w_i). \qquad (1.78)$$

This gives rise to two important unconditional expectations:

$$E[\delta W_{t_i}\,\delta W_{t_j}] = \delta_{ij}\,\delta t_i, \qquad E[\delta W_{t_i}] = 0. \qquad (1.79)$$


By usual convention we fix $W_0 = 0$. The joint pdf for the random variables $W_{t_1}, \ldots, W_{t_N}$
representing the probability density at the path points $W_{t_i} = w_i$ ($w_0 = 0$) is then also a
multivariate Gaussian function, which is obtained by simply setting $\delta w_i = w_{i+1} - w_i$ in
equation (1.78). The set of real-valued random variables $(W_{t_i})_{i=0,\ldots,N}$ therefore represents the
time-discretized standard Brownian motion (or Wiener process) at arbitrary discrete points
in time. Iterating equation (1.76) gives

$$x_{t_N} = x_{t_0} + \sum_{j=0}^{N-1}\left[\mu(t_j)\,\delta t_j + \sigma(t_j)\,\delta W_{t_j}\right], \qquad (1.80)$$

where $x_{t_N} = x_t$ and $x_{t_0} = x_0$. The random variable $x_t$ is normal with mean

$$E_0[x_t] = x_0 + \sum_{j=0}^{N-1} \mu(t_j)\,\delta t_j \qquad (1.81)$$

and variance

$$E_0[(x_t - E_0[x_t])^2] = E_0\left[\left(\sum_{j=0}^{N-1} \sigma(t_j)\,\delta W_{t_j}\right)^2\right] = \sum_{j=0}^{N-1} \sigma(t_j)^2\,\delta t_j. \qquad (1.82)$$


Note: We use $E_0[\,\cdot\,]$ to denote the expectation conditional only on the value of paths being
fixed at initial time; i.e., $x_{t_0} = x_0$ = fixed value. This is hence an unconditional expectation
with respect to path values at any later time $t > 0$. Later, we will at times simply use the
unconditional expectation $E[\,\cdot\,]$ to denote $E_0[\,\cdot\,]$. Sample paths of a process with zero mean and
constant volatility are displayed in Figure 1.2.
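
The recurrence (1.76) translates directly into a simulation. The following is a minimal sketch, assuming constant drift and volatility; the function name `simulate_paths`, the number of paths, and the seed are illustrative choices, while the values $x_0 = 10$, $\mu = 0.1$, $\sigma = 0.2$, $N = 100$, $\delta t_i = 0.01$ mirror those quoted for Figure 1.2. The sample mean and variance of the terminal values can be compared with equations (1.81) and (1.82).

```python
import numpy as np

def simulate_paths(x0, mu, sigma, dt, n_steps, n_paths, rng):
    """Iterate x_{t_{i+1}} = x_{t_i} + mu*dt + sigma*dW_i, with dW_i ~ N(0, dt) independent."""
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    increments = mu * dt + sigma * dW
    paths = np.empty((n_paths, n_steps + 1))
    paths[:, 0] = x0
    # The cumulative sum of increments is the iterated form (1.80).
    paths[:, 1:] = x0 + np.cumsum(increments, axis=1)
    return paths

rng = np.random.default_rng(42)
x0, mu, sigma, dt, n_steps = 10.0, 0.1, 0.2, 0.01, 100
paths = simulate_paths(x0, mu, sigma, dt, n_steps, 20_000, rng)

t = n_steps * dt
terminal = paths[:, -1]
sample_mean = terminal.mean()   # compare with x0 + mu * t        (1.81)
sample_var = terminal.var()     # compare with sigma**2 * t       (1.82)
print(sample_mean, x0 + mu * t)
print(sample_var, sigma**2 * t)
```

Plotting the first five rows of `paths` against the time grid gives a picture of the kind described for Figure 1.2, although the individual trajectories depend on the random seed.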

FIGURE 1.2 A simulation of five stochastic paths using equation (1.76), with $x_0 = 10$, constant
$\mu(t) = 0.1$, $\sigma(t) = 0.2$, $N = 100$, and time steps $\delta t_i = 0.01$.

Typical stochastic processes in finance are meaningful if time is discretized. The choice
of the elementary unit of time is part of the modeling assumptions and depends on the
applications at hand. In pricing theory, the natural elementary unit is often one day but
can also be one week or one month, as well as five minutes or one tick, depending on the
objective. The mathematical theory, however, simplifies in the continuous-time limit, where
the elementary time is infinitesimal with respect to the other time units in the problem, such as
option maturities and cash flow periods. Mathematically, one can construct continuous-time
processes by starting from a sequence of approximating processes defined for discrete-time
values $i\,\delta t$, $i = 0, \ldots, N$, and then passing to the limit as $\delta t \to 0$. More precisely, one can define
a continuous-time process in an interval $[t_0, t_N]$ by subdividing it into $N$ subintervals of equal
length, defining a discrete-time process $x_t^N$ with $x_t^N = x_{t_i}$ at the partition points $t = t_i$, and then computing the limit

$$x_t = \lim_{N\to\infty} x_t^N \qquad (1.83)$$

by assuming that the discrete-time process $x_t^N$ is constant over the partition subintervals.
The elementary increments $\delta x_t = x_{t+\delta t} - x_t$ are random variables that obviously tend to zero
as $\delta t \to 0$, but which are still meaningful in this limit. The convention is to denote these
increments as $dx_t$ in the limit $\delta t \to 0$ and to consider the straight d as a reminder that, at the
end of the calculations, one is ultimately interested in the limit as $\delta t \to 0$.$^4$

The continuous-time limit is obtained by holding the terminal time $t = t_N$ fixed and letting
$N \to \infty$, i.e.,

$$E_0[x_t] = \lim_{N\to\infty} E_0[x_t^N] = x_0 + \int_0^t \mu(\tau)\, d\tau = x_0 + \bar{\mu}(t)\, t, \qquad (1.84)$$

and

$$E_0[(x_t - E_0[x_t])^2] = \lim_{N\to\infty} E_0[(x_t^N - E_0[x_t^N])^2] = \int_0^t \sigma^2(\tau)\, d\tau = \bar{\sigma}(t)^2\, t, \qquad (1.85)$$

where we introduced the time-averaged drift $\bar{\mu} = \bar{\mu}(t)$ and volatility $\bar{\sigma} = \bar{\sigma}(t)$ over the time
period $[0, t]$, $t \in \mathbb{R}_+$. Since $x_t$ is normally distributed, we finally arrive at the transition
probability density for a stochastic path to attain value $x_t$ at time $t$, given an initially known
value $x_0$ at time $t = 0$:

$$p(x_t, x_0; t) = \frac{1}{\bar{\sigma}\sqrt{2\pi t}} \exp\left(-\frac{(x_t - x_0 - \bar{\mu} t)^2}{2\bar{\sigma}^2 t}\right). \qquad (1.86)$$

This density, therefore, gives the distribution (conditional on a starting value $x_0$) for a process
with continuous motion on the entire real line $x_t \in (-\infty, \infty)$ with constant drift and volatility.
[Note: $x_0$, $x_t$ are real numbers (not random) in equation (1.86).]

A Markov chain is a discrete-time stochastic process such that for all times $t \in \mathcal{T}$ (the set of discrete times) the
increments $x_{t+\delta t} - x_t$ are random variables independent of $x_t$. A Markov process is the
continuous-time limit of a Markov chain. The process just introduced provides an example
of a Markov chain because the increments are independent.

$^4$ These definitions are admittedly not entirely rigorous, but they are meant to allow the reader to quickly
develop an intuition in case she doesn't have a formal probability education. In keeping with the purpose of this
book, our objective is to have the reader learn how to master the essential techniques in stochastic calculus that are
useful in finance without assuming that she first learn the formal mathematical theory.

The probability space for a general discrete-time stochastic process where calendar time
can take on values $t_0 < t_1 < \cdots < t_N$ is the space of vectors $\mathbf{x} \in \mathbb{R}^N$ with an appropriate
multivariate measure, such as $P(d\mathbf{x}) = p(x_1, \ldots, x_N)\,d\mathbf{x}$, where $p$ is a probability density. By
considering a process $x_t$ only up to an intermediate time $t_i$, $i < N$, we are essentially restricting
the information set of possible events or probability space of paths. The family $(\mathcal{F}_t)_{t\ge 0}$ of all
reduced (or filtered) probability spaces $\mathcal{F}_t$ up to time $t$, for all times $t \ge 0$, is called a filtration.
One can think of $\mathcal{F}_t$ as the set of all paths up to time $t$. A pay-off of a derivative contract
occurring at time $t$ is a well-defined (measurable) random variable on all the spaces $\mathcal{F}_{t'}$ with
$t' \ge t$ but not on the spaces with $t' < t$. Filtrations are essentially hierarchies of probability
spaces (or information sets) through which more and more information is revealed to us
as time progresses; i.e., $\mathcal{F}_t \subset \mathcal{F}_{t'}$ if $t < t'$ so that given a time partition $t_0 < t_1 < \cdots < t_N$,
$\mathcal{F}_{t_0} \subset \mathcal{F}_{t_1} \subset \cdots \subset \mathcal{F}_{t_N}$. We say that a random variable or process is $\mathcal{F}_t$-measurable if its value
is revealed at time $t$. Such a random variable or process is also said to be nonanticipative
with respect to the filtration or $\mathcal{F}_t$-adapted (see later for a definition of nonanticipative
functions, while a definition of an adapted process is also provided in Section 1.9 in the
context of continuous-time asset pricing). Conditional expectations with respect to a filtration
$\mathcal{F}_t$ represent expectations conditioned on knowing all of the information about the process
only up to time $t$. It is customary to use the following shorthand notation for conditional
expectations:

$$E_t[\,\cdot\,] = E[\,\cdot\,|\,\mathcal{F}_t]. \qquad (1.87)$$


Definition 1.9. Martingale A real-valued $\mathcal{F}_t$-adapted continuous-time process $(x_t)_{t\ge 0}$ is said
to be a P-martingale if the boundedness condition $E[|x_t|] < \infty$ holds for all $t \ge 0$ and

$$x_t = E_t[x_T], \qquad (1.88)$$

for $0 \le t \le T < \infty$.


This definition implies that the conditional expectation for the value of a martingale
process at a future time $T$, given all previous history up to the current time $t$ (i.e., adapted
to a filtration $\mathcal{F}_t$), is its current time $t$ value. Our best prediction of future values of such
a process is therefore just the presently observed value. [Note: Although we have used the
same notation, i.e., $x_t$, this definition generally applies to arbitrary continuous-time processes
that satisfy the required conditions; the pure Wiener process or standard Brownian motion is
just a special case.] We remark that the expectation $E[\,\cdot\,] = E^P[\,\cdot\,]$ and conditional expectation
$E_t[\,\cdot\,] = E_t^P[\,\cdot\,]$ are assumed here to be taken with respect to a given probability measure $P$.
For ease of notation in what follows we drop the explicit use of the superscript $P$ unless the
probability measure must be made explicit. If one changes filtration or the probability space 
associated with the process, then the same process may not be a martingale with respect to 
the new probability measure and filtration. However, the reverse also applies, in the sense 
that a process may be converted into a martingale by modifying the probability measure. 

A more general property satisfied by a stochastic process $(x_t)_{t\ge 0}$ (regardless of whether
the process is a martingale or not) is the so-called tower property for $s \le t \le T$:

$$E_s[E_t[x_T]] = E_s[x_T]. \qquad (1.89)$$


This follows from the basic property of conditional expectations: The expectation of a 
future expectation must be equal to the present expectation or presently forecasted value. 
Another way to see this is that a recursive application of conditional expectations always 
gives the conditional expectation with respect to the smallest information set. In this case 
$\mathcal{F}_s \subset \mathcal{F}_t \subset \mathcal{F}_T$. A martingale process $f_t = f(x_t, t)$ can also be specified by considering a
conditional expectation over some (payoff) function $\phi$ of an underlying process. In particular,
consider an underlying process $x_t$ starting at time $t_0$ with some value $x_0$ and define the conditional
expectation

$$f_t = f(x_t, t) = E_t[\phi(x_T, T)]. \qquad (1.90)$$

Then, for any $t_0 \le s \le t \le T$, $f_t$ satisfies the martingale property. In fact

$$E_s[f_t] = E_s\left[E_t[\phi(x_T, T)]\right] = E_s[\phi(x_T, T)] = f(x_s, s). \qquad (1.91)$$


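The process introduced in equation (1.76) is a martingale in case the drift function $\mu(t)$
is identically zero. In fact, in this case if $t_i < t_j$, we have

$$E_{t_i}[x_{t_j}] = E_{t_i}\left[E_{t_{j-1}}[x_{t_j}]\right] = E_{t_i}[x_{t_{j-1}}] = \cdots \qquad (1.92)$$

$$\cdots = E_{t_i}[x_{t_{i+1}}] = x_{t_i}. \qquad (1.93)$$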


Bachelier$^5$ was one of the pioneers of stochastic calculus, and he proposed to use a process
similar to $x_t$ as defined by equation (1.76) in the continuous-time limit to model stock price
processes. A difficulty with the Bachelier model was that stock prices can attain negative
values. The problem can be corrected by regarding $x_t$ to be the natural logarithm of stock
prices; this conditional density turns out to be related to (although not equivalent to) the
risk-neutral density used for pricing derivatives within the Black-Scholes formulation, as is
seen in Section 1.6, where we take a close look at geometric Brownian motion. The density
in equation (1.86) leads to Bachelier's formula for the expectation of the random variable
$(x_t - K)_+$, with constant $K > 0$, where $(x)_+ = x$ if $x > 0$, $(x)_+ = 0$ if $x \le 0$ (see Problem 9).
In passing to the continuous-time limit, we have, based on equation (1.86), arrived at an
expression for the random variable $x_t$ in terms of the random variable $W_t$ for the standard
Brownian motion (or Wiener process):

$$x_t = x_0 + \bar{\mu} t + \bar{\sigma} W_t. \qquad (1.94)$$


The distribution for the zero-drift random variable $(W_t)_{t\ge 0}$, representing the real-valued
standard Brownian motion (Wiener process) at time $t$ with $W_{t=0} = W_0 = 0$, is given by

$$p_W(w, t) = \frac{1}{\sqrt{2\pi t}}\, e^{-w^2/2t} \qquad (1.95)$$

at $W_t = w$. Note that this is also entirely consistent with the marginal density obtained by
integrating out all intermediate variables $w_1, \ldots, w_{N-1}$ in the joint pdf of the discretized
process $(W_{t_i})_{i=0,\ldots,N}$ with $w = w_N$, $t = t_N$.

$^5$ The date March 29, 1900, should be considered as the birth date of mathematical finance. On that day, Louis
Bachelier successfully defended at the Sorbonne his thesis Théorie de la Spéculation. As a work of exceptional
merit, strongly supported by Henri Poincaré, Bachelier's supervisor, it was published in Annales Scientifiques de
l'École Normale Supérieure, one of the most influential French scientific journals. This model was a breakthrough
that motivated much of the future work by Kolmogorov and others on the foundations of modern stochastic calculus.
The stochastic process proposed by Bachelier was independently analyzed by Einstein (1905) and is referred to as
Brownian motion in the physics literature. It is also referred to as the Wiener-Bachelier process in a book by Feller,
An Introduction to Probability Theory and Its Applications [Fel71]. However, this terminology did not take hold,
and now the process is commonly called the Wiener process.

According to the distributions given by equations (1.77) and (1.95), one concludes that
standard Brownian motion (or the Wiener process) is a martingale process characterized by
independent Gaussian (normal) increments with trajectories [i.e., path points $(t, x_t)$] that are
continuous in time $t \ge 0$: $\delta W_t = W_{t+\delta t} - W_t \sim N(0, \delta t)$ (i.e., normally distributed with mean
zero and variance $\delta t$) and $W_{t+\delta t} - W_t$ is independent of $W_s$ for $\delta t > 0$, $0 \le s \le t$, $0 \le t < \infty$.
Moreover, specializing to the case of zero drift and $\sigma = 1$ and putting $t_0 = s$, the corresponding
probability distribution given by equation (1.86) with shifted time $t \to (t - s)$ then gives the
well-known property: $W_t - W_s \sim N(0, t - s)$, $W_t \sim N(0, t)$. In fact we have the homogeneity
property for the increments: $W_{t+s} - W_s \sim W_t - W_0 = W_t \sim N(0, t)$. In particular, $E[W_t] = 0$
and $E[W_t^2] = t$. An additional property is $E[W_s W_t] = \min(s, t)$. This last identity follows
from the independence of the increments [i.e., equation (1.79)]. Indeed consider any $t_i < t_j$,
$0 \le i < j \le N$, then:

$$E[W_{t_i} W_{t_j}] = E\left[(W_{t_i} - W_0)\left((W_{t_j} - W_{t_i}) + (W_{t_i} - W_0)\right)\right] = E[(W_{t_i} - W_0)^2] = E[W_{t_i}^2] = t_i. \qquad (1.96)$$

A similar argument with $t_j < t_i$ gives $t_j$, while for $t_i = t_j$ we obviously obtain $t_i$. All of these
properties also follow by taking expectations with respect to the joint pdf for the Wiener paths.
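
The identity $E[W_s W_t] = \min(s, t)$ can be checked by sampling, using only the independent-increment construction described above. This is an illustrative sketch: the times $s = 0.4$, $t = 1.0$, the sample size, and the seed are assumptions of the example, and the estimates agree with the exact values only up to Monte Carlo error.

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, s, t = 400_000, 0.4, 1.0

# W_s ~ N(0, s); W_t = W_s + (W_t - W_s) with an independent N(0, t - s) increment.
W_s = rng.normal(0.0, np.sqrt(s), size=n_paths)
W_t = W_s + rng.normal(0.0, np.sqrt(t - s), size=n_paths)

cov_est = np.mean(W_s * W_t)   # estimate of E[W_s W_t], exact value min(s, t)
var_est = np.mean(W_t**2)      # estimate of E[W_t^2], exact value t
print(cov_est, min(s, t))
print(var_est, t)
```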

An important aspect of martingales is whether or not their trajectories or paths are 
continuous in time. Consider any real-valued martingale $x_t$; then $\delta x_t = x_{t+\delta t} - x_t$ is a process
corresponding to the change in a path over an arbitrary time difference $\delta t > 0$. From
equation (1.88), $E_t[\delta x_t] = 0$, so, not surprisingly, the increments of a martingale path are
unpredictable (irregular), even in the infinitesimal limit $\delta t \to 0$. However, the irregularity of
paths can be either continuous or discontinuous. An example of a martingale with discontinu- 
ous paths is a jump process, where paths are generally right continuous at every point in time 
as a consequence of incorporating jump discontinuities in the process at a random yet count- 
able number of points within a time period. We refer the interested reader to recent works on 
the growing subject of financial modeling with jump processes (see, for example, [CT04]). 
Here and throughout, we focus on continuous diffusion models for asset pricing; hence our 
discussion is centered on continuous martingales (i.e., martingales with continuous paths). 
Let $f(t) = x_t(\omega)$, $t \ge 0$, represent a particular realized path indexed by the scenario $\omega$; then
continuity in the usual sense implies that the graph of $f(t)$ against time is continuous for all
$t \ge 0$. Denoting the left and right limits at $t$ by $f(t-) = \lim_{s\to t^-} f(s)$ and $f(t+) = \lim_{s\to t^+} f(s)$,
then $f(t) = f(t-) = f(t+)$. Every Brownian path or any path of a stochastic process generated
by an underlying Brownian motion displays this property, as can be observed, for example, in
Figure 1.2. [In contrast, a path of a jump diffusion process would display a similar continuity
in piecewise time intervals but with the additional feature of vertical jump discontinuities at
random points in time at which only right continuity holds. If $t$ is a jump time, then $f(t-)$,
$f(t+)$ both exist, yet $f(t-) \ne f(t+)$ with $f(t) = f(t+)$, where $f(t) - f(t-)$ is the size of the
jump at time $t$.]

Stochastic continuity refers to continuity of sample paths of a process $(x_t)_{t\ge 0}$ in the
probabilistic sense as defined by

$$\lim_{s\to t} P(|x_s - x_t| > \epsilon) = 0, \quad s, t > 0, \qquad (1.97)$$

for any $\epsilon > 0$. This is readily seen to hold for Brownian motion and for continuous martingales.
The class of continuous-time martingales that are of interest are so-called continuous square
integrable martingales, i.e., martingales with finite unconditional variance or finite second
moment: $E[x_t^2] < \infty$ for $t \ge 0$. Such processes are closely related to Brownian motion and
include Brownian motion itself. Further important properties of the paths of a continuous
square integrable martingale (e.g., Brownian motion) then also follow. Consider again the
time discretization $t_0 = 0 < t_1 < \cdots < t_N = t$ of $[0, t]$ with subintervals $[t_i, t_{i+1}]$ and path points
$(t_i, x_{t_i})$. The variation and quadratic variation of the path are, respectively, defined as:

$$V_t = \lim_{N\to\infty} V_t^N = \lim_{N\to\infty} \sum_{i=0}^{N-1} |\delta x_{t_i}| \qquad (1.98)$$

and

$$V_t^{(2)} = \lim_{N\to\infty} V_t^{(2),N} = \lim_{N\to\infty} \sum_{i=0}^{N-1} (\delta x_{t_i})^2, \qquad (1.99)$$

$\delta x_{t_i} = x_{t_{i+1}} - x_{t_i}$. The properties of $V_t$ and $V_t^{(2)}$ provide two differing measures of how paths
behave over time and give rise to important implications for stochastic calculus. Since the
process is generally of nonzero variance, then $P(V_t^N > 0) = 1$ and $P(V_t^{(2),N} > 0) = 1$. In particular,
if we let $\delta t_i = \delta t = t/N$ and consider the case of Brownian motion $x_t = W_t$, then by rewriting
$V_t^{(2)}$ we have with probability 1:

$$V_t^{(2)} = \lim_{N\to\infty} \left(\frac{1}{N}\sum_{i=0}^{N-1} (\delta x_{t_i})^2\right) N = \lim_{N\to\infty} \left(\frac{1}{N}\sum_{i=0}^{N-1} (\delta W_{t_i})^2\right) N = t. \qquad (1.100)$$

Here we used the strong law of large numbers and the fact that the $(\delta W_{t_i})^2$ are identically
and independently distributed random variables with common mean of $\delta t$. Based on this
important property of nonzero quadratic variation, Brownian paths, although continuous, are
not differentiable. For finite $N$ the variation $V_t^N$ is finite. As the number $N$ of increments
goes to infinity, $\delta t_i \to 0$ and, from property (1.97), we see that the size of the increments
approaches zero. The question that arises then is whether $V_t$ exists or not. Except for the
trivial case of a constant martingale, the result is that $V_t^N \to \infty$ as $N \to \infty$; i.e., the variation
$V_t$ is in fact infinite. Without trying to provide any rigorous proof of this here, we simply state
the usual heuristic and somewhat instructive argument for this fact based on the following
observation:

$$\sum_{i=0}^{N-1} (\delta x_{t_i})^2 \le \max_i\{|\delta x_{t_i}|\} \sum_{i=0}^{N-1} |\delta x_{t_i}| = \max_i\{|\delta x_{t_i}|\}\, V_t^N. \qquad (1.101)$$

Since the quadratic variation $V_t^{(2)}$ is greater than zero, taking the limit $N \to \infty$ on both
sides of the inequality shows that the right-hand side must have a nonzero limit. Yet from
equation (1.97) we have $\max_i\{|\delta x_{t_i}|\} \to 0$ as $N \to \infty$. Hence we must have that the right-hand
side is a limit of an indeterminate form (of type $0 \cdot \infty$); that is, $V_t = \lim_{N\to\infty} V_t^N = \infty$, which
is what we wanted to show.
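
The contrasting behavior of the two sums can be seen numerically for Brownian motion. The sketch below is illustrative only: the helper `variations`, the grid sizes, and the seed are assumptions, and each value of $N$ uses an independent draw of increments rather than a refinement of one fixed path. The sum of squared increments settles near $t$, while the sum of absolute increments grows roughly like $\sqrt{N}$.

```python
import numpy as np

def variations(t, n_steps, rng):
    """Sum of |dW| and sum of dW^2 over n_steps equal subintervals of [0, t]."""
    dW = rng.normal(0.0, np.sqrt(t / n_steps), size=n_steps)
    return np.sum(np.abs(dW)), np.sum(dW**2)

rng = np.random.default_rng(7)
t = 1.0
results = {n: variations(t, n, rng) for n in (100, 10_000, 1_000_000)}

for n, (abs_sum, sq_sum) in results.items():
    # abs_sum keeps growing with n; sq_sum stays close to t.
    print(n, abs_sum, sq_sum)
```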

Once we are equipped with a standard Brownian motion and a filtered probability space,
then the notion of stochastic integration arises by considering the concept of a nonanticipative
function. Essentially, a (random) function $f_t$ is said to be nonanticipative w.r.t. a Brownian
motion or process $W_t$ if its value at any time $t \ge 0$ is independent of future information. That
is, $f_t$ is possibly only a function of the history of paths up to time $t$ and time $t$ itself: $f_t =
f(\{W_s\}_{0\le s\le t}, t)$. The value of this function at time $t$ for a particular realization or scenario $\omega$
may be denoted by $f_t(\omega)$. Nonanticipative functions therefore include all deterministic (i.e.,
nonrandom) functions as a special case. Given a continuous nonanticipative function $f_t$ that
satisfies the "nonexplosive" condition

$$E\left[\int_0^t f_s^2\, ds\right] < \infty, \qquad (1.102)$$

the Itô (stochastic) integral is the random variable denoted by

$$I_t(f) = \int_0^t f_s\, dW_s < \infty \qquad (1.103)$$

and is defined by the limit

$$I_t(f) = \lim_{N\to\infty} \sum_{i=0}^{N-1} f_{t_i}\,\delta W_{t_i} = \lim_{N\to\infty} \sum_{i=0}^{N-1} f_{t_i}\left[W_{t_{i+1}} - W_{t_i}\right]. \qquad (1.104)$$

It can be shown that this limit exists for any choice of time partitioning of the interval $[0, t]$;
e.g., we can choose $\delta t_i = \delta t = t/N$. Each term in the sum is given by a random number $f_{t_i}$
[but fixed over the next time increment $(t_i, t_{i+1})$] times a random Gaussian variable $\delta W_{t_i}$.
Because of this, the Itô integral can be thought of as a random walk on increments with
randomly varying amplitudes. Since $f_t$ is nonanticipative, then for each ith step we have
the conditional expectation for each increment in the sum: $E_{t_i}[f_{t_i}\,\delta W_{t_i}] = f_{t_i} E_{t_i}[\delta W_{t_i}] = 0$.
Given nonanticipative functions $f_t$ and $g_t$, the following formulas provide us with the first
and second moments as well as the variance-covariance properties of Itô integrals:

$$\text{(i)} \quad E[I_t(f)] = E\left[\int_0^t f_s\, dW_s\right] = 0, \qquad (1.105)$$

$$\text{(ii)} \quad E[(I_t(f))^2] = E\left[\left(\int_0^t f_s\, dW_s\right)^2\right] = E\left[\int_0^t f_s^2\, ds\right], \qquad (1.106)$$

$$\text{(iii)} \quad E[I_t(f)\, I_t(g)] = E\left[\left(\int_0^t f_s\, dW_s\right)\left(\int_0^t g_s\, dW_s\right)\right] = E\left[\int_0^t f_s g_s\, ds\right]. \qquad (1.107)$$

Based on the definition of $I_t(f)$ and the properties of Brownian increments, it is not difficult
to obtain these relations. We leave this as an exercise for the reader. Of interest in finance
are nonanticipative functions of the form $f_t = f(x_t, t)$, where $x_t$ is generally a continuous
stochastic (price) process $(x_t)_{t\ge 0}$. The Itô integral is then of the form

$$I_t(f) = \int_0^t f(x_s, s)\, dW_s \qquad (1.108)$$

and, assuming that condition (1.102) holds, then properties (i)-(iii) also apply. Another notable
property is that the Itô integral is a martingale, since $E_t[I_u(f)] = I_t(f)$, for $0 \le t \le u$.
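
The left-endpoint sums in the definition of the Itô integral can be evaluated directly for the simple nonanticipative choice $f_t = W_t$. The sketch below is an illustration under stated assumptions: the grid size, number of paths, and seed are arbitrary, and the comparison value $\tfrac{1}{2}(W_t^2 - t)$ is the well-known limit of the sums for this integrand rather than something derived in the text above. For the discrete sums, $E[(\sum_i W_{t_i}\delta W_{t_i})^2] = \sum_i t_i\,\delta t$, which tends to $t^2/2$ in agreement with property (ii).

```python
import numpy as np

rng = np.random.default_rng(3)
t, n_steps, n_paths = 1.0, 200, 20_000
dt = t / n_steps

dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
W = np.cumsum(dW, axis=1)                                  # W_{t_1}, ..., W_{t_N}
W_left = np.hstack([np.zeros((n_paths, 1)), W[:, :-1]])    # W_{t_0} = 0, ..., W_{t_{N-1}}

# Left-endpoint (nonanticipative) sums: sum_i f_{t_i} * dW_{t_i} with f = W.
ito_sum = np.sum(W_left * dW, axis=1)

mean_est = ito_sum.mean()                                  # property (i): close to 0
second_moment_est = np.mean(ito_sum**2)                    # property (ii)
isometry_value = dt**2 * n_steps * (n_steps - 1) / 2       # sum_i t_i * dt for this grid
closed_form = 0.5 * (W[:, -1]**2 - t)                      # continuous-time limit
mean_abs_gap = np.mean(np.abs(ito_sum - closed_form))
print(mean_est, second_moment_est, isometry_value, mean_abs_gap)
```

Evaluating the integrand at the right endpoint `W` instead of `W_left` would anticipate the increment and produce sums with mean close to $t$ rather than zero, which is why the left endpoint is essential in the definition.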

The Itô integral leads us into important types of processes and the concept of a stochastic
differential equation (SDE). In fact the general class of stochastic processes that take the form
of sums of stochastic integrals are (not surprisingly) known as Itô processes. It is of interest
to consider nonanticipative processes of the type $a_t = a(x_t, t)$ and $b_t = b(x_t, t)$, $t \ge 0$, where
$(x_t)_{t\ge 0}$ is a random process. A stochastic process $(x_t)_{t\ge 0}$ is then an Itô process if there exist
two nonanticipative processes $(a_t)_{t\ge 0}$ and $(b_t)_{t\ge 0}$ such that the conditions

$$P\left(\int_0^t |a_s|\, ds < \infty\right) = 1 \quad \text{and} \quad P\left(\int_0^t b_s^2\, ds < \infty\right) = 1$$

are satisfied, and

$$x_t = x_0 + \int_0^t a(x_s, s)\, ds + \int_0^t b(x_s, s)\, dW_s, \qquad (1.109)$$

for $t > 0$. These probability conditions are commonly imposed smoothness conditions on the
drift and volatility functions. This stochastic integral equation is conveniently and formally
abbreviated by simply writing it in SDE form:

$$dx_t = a(x_t, t)\, dt + b(x_t, t)\, dW_t. \qquad (1.110)$$

We shall use SDE notation in most of our future discussions of Itô processes.

Itô integrals give rise to an important property, known as Doob-Meyer decomposition. In
particular, it can be shown that if $(M_t)_{0\le t\le T}$ is a square integrable martingale process, then
there exists a (nonanticipative) process $(f_t)_{0\le t\le T}$ that satisfies equation (1.102) such that

$$M_t = M_0 + \int_0^t f_s\, dW_s. \qquad (1.111)$$

(The representation (1.111) is more commonly called the martingale representation theorem, and it
requires the martingale to be adapted to the filtration generated by the Brownian motion.)
From this we observe that an Itô process $x_t$ as given by equation (1.109) is divisible into a
sum of a martingale component and a (generally random) drift component.


Problems 


Problem 1. Show that the finite difference $\dfrac{x_{t_{i+1}} - x_{t_i}}{\delta t_i}$ of the Brownian motion in equation (1.76)
is a normally distributed random variable with mean $\mu(t_i)$ and volatility $\sigma(t_i)/\sqrt{\delta t_i}$. Hint:
Use equation (1.76) and take expectations while using equation (1.79).


Problem 2. Show that the random variable

$$\xi = \sum_{i=0}^{N-1} a(t_i)\,\delta x_{t_i}, \qquad (1.112)$$

where $\delta x_{t_i} = x_{t_{i+1}} - x_{t_i}$ and $x_{t_i}$ is defined by equation (1.76), is a normal random variable. Compute
its mean and variance. Hint: Take appropriate expectations while using equation (1.79).


Problem 3. Suppose that the time intervals are given by $\delta t_i = t/N$, where $t$ is any finite time
value and $N$ is an integer. Show that equations (1.84) and (1.85) follow in the continuous-time
limit as $N \to \infty$ for fixed $t$.


Problem 4. Show that the random variable $\xi = \sum_{i=1}^{N} a(t_i)(\delta W_{t_i})^2$ has mean and variance
given by

$$E[\xi] = \sum_{i=1}^{N} a(t_i)\,\delta t_i, \qquad E[(\xi - E[\xi])^2] = 2\sum_{i=1}^{N} a(t_i)^2 (\delta t_i)^2. \qquad (1.113)$$

Hint: Since $\delta W_{t_i} \sim N(0, \delta t_i)$ independently for each $i$, one can use the identity in Problem 2
of Section 1.6. That is, by considering $E[\exp(\alpha\,\delta W_{t_i})]$ for nonzero parameter $\alpha$ and applying
a Taylor expansion of the exponential and matching terms in the power series in $\alpha^n$, one
obtains $E[(\delta W_{t_i})^n]$ for any $n \ge 0$. For this problem you only need terms up to $n = 4$.


Problem 5. Show that the distribution $p(x, x_0; t)$ in equation (1.86) approaches the one-
dimensional Dirac delta function $\delta(x - x_0)$ in the limit $t \to 0$.


Problem 6. (i) Obtain the joint marginal pdf of the random variables $W_s$ and $W_t$, $s \ne t$.
Evaluate $E[(W_t - W_s)^2]$ for all $s, t \ge 0$. (ii) Compute $E_t[W_s^2]$ for $s > t$.


Problem 7. Let the processes $(x_t)_{t\ge 0}$ and $(y_t)_{t\ge 0}$ be given by $x_t = x_0 + \mu_x t + \sigma_x W_t$ and
$y_t = y_0 + \mu_y t + \sigma_y W_t$, where $\mu_x$, $\mu_y$, $\sigma_x$, $\sigma_y$ are constants. Find:


(i) the means $E[x_t]$, $E[y_t]$;
(ii) the unconditional variances $\mathrm{Var}(x_t)$, $\mathrm{Var}(y_t)$;
(iii) the unconditional covariances $\mathrm{Cov}(x_t, y_t)$ and $\mathrm{Cov}(x_s, y_t)$ for all $s, t \ge 0$.


Problem 8. Obtain $E[X_t]$, $\mathrm{Var}(X_t)$, and $\mathrm{Cov}(X_s, X_t)$ for the processes

$$\text{(a)} \quad X_t = X_0 e^{-\alpha t} + \sigma\int_0^t e^{-\alpha(t-s)}\, dW_s, \quad t \ge 0, \qquad (1.114)$$

$$\text{(b)} \quad X_t = \alpha\left(1 - \frac{t}{T}\right) + \beta\frac{t}{T} + (T - t)\int_0^t \frac{dW_s}{T - s}, \quad 0 \le t \le T, \qquad (1.115)$$

where $\alpha$, $\beta$, $\sigma$ are constant parameters and time $T$ is fixed in (b). The process in (a) describes
the so-called Ornstein-Uhlenbeck process, while (b) describes a Brownian bridge, whereby
the process is Brownian in nature, yet it is also exactly pinned down at initial time and final
time $T$, i.e., $X_0 = \alpha$, $X_T = \beta$. For (a) assume $X_0$ is a constant.


Problem 9. Assume that $x_t$ is described by a random process given by equation (1.94),
or equivalently by the conditional density in equation (1.86). Show that the conditional
expectation at time $t = 0$ defined by

$$C(t, K) = E_0[(x_t - K)_+], \qquad (1.116)$$

where $(x)_+ = x$ if $x > 0$ and zero otherwise, gives the formula

$$C(t, K) = \bar{\sigma}\sqrt{t}\; n\!\left(\frac{x_0 + \bar{\mu} t - K}{\bar{\sigma}\sqrt{t}}\right) + (x_0 + \bar{\mu} t - K)\, N\!\left(\frac{x_0 + \bar{\mu} t - K}{\bar{\sigma}\sqrt{t}}\right), \qquad (1.117)$$

where $N(\cdot)$ is the standard cumulative normal distribution function and

$$n(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}. \qquad (1.118)$$

Setting the drift to zero, $\bar{\mu} = 0$, gives Bachelier's formula. This corresponds (from the
viewpoint of pricing theory) to the fair price of a standard call option struck at $K$, and maturing
in time $t$, assuming a zero interest rate and simple Brownian motion for the underlying "stock"
level $x_t$ at time $t$. Hint: One way to obtain equation (1.117) is by direct integration over all
$x_t$ of the product of the density $p$ [of equation (1.86)] and the payoff function $(x_t - K)_+$. Use
appropriate changes of integration variables and the property $1 - N(x) = N(-x)$ to arrive at
the final expression.
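
As a numerical sanity check on the result this problem asks for, the expectation $E_0[(x_t - K)_+]$ for a normally distributed $x_t$ with mean $x_0 + \mu t$ and standard deviation $\sigma\sqrt{t}$ can be compared with a Monte Carlo average. The sketch below is not a derivation: the function names, parameter values, and seed are illustrative assumptions, and the closed form coded here is the standard normal-model expression $s\,n(d) + (m - K)N(d)$ with $m = x_0 + \mu t$, $s = \sigma\sqrt{t}$, $d = (m - K)/s$.

```python
import math
import numpy as np

def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bachelier_call(x0, K, mu, sigma, t):
    """E_0[(x_t - K)_+] for x_t = x0 + mu*t + sigma*W_t (constant mu, sigma assumed)."""
    s = sigma * math.sqrt(t)
    d = (x0 + mu * t - K) / s
    return s * norm_pdf(d) + (x0 + mu * t - K) * norm_cdf(d)

rng = np.random.default_rng(5)
x0, K, mu, sigma, t = 10.0, 10.5, 0.1, 2.0, 1.0

# Sample x_t directly from its normal law and average the payoff.
x_t = x0 + mu * t + sigma * math.sqrt(t) * rng.standard_normal(500_000)
mc_price = np.maximum(x_t - K, 0.0).mean()
formula_price = bachelier_call(x0, K, mu, sigma, t)
print(mc_price, formula_price)
```

With zero drift and $K = x_0$ the closed form reduces to $\sigma\sqrt{t}/\sqrt{2\pi}$, which gives a quick hand check of the implementation.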


1.5 Stochastic Differential Equations and Itô's Formula


For purposes of describing asset price processes it is of interest to consider SDEs for diffusion
processes $x_t$ that are defined in terms of a lognormal drift function $\mu(x, t)$ and a lognormal
volatility function $\sigma(x, t)$ and are written as follows:$^6$

$$dx_t = \mu(x_t, t)\, x_t\, dt + \sigma(x_t, t)\, x_t\, dW_t. \qquad (1.119)$$

Assuming the drift and volatility are smooth functions, the discretization process in the
previous section extends to this case and produces a solution to equation (1.119) as the limit
as $N \to \infty$ of the Markov chain $x_{t_0}, \ldots, x_{t_N}$ defined by means of the recurrence relations

$$x_{t_{i+1}} = x_{t_i} + \mu(x_{t_i}, t_i)\, x_{t_i}\,\delta t_i + \sigma(x_{t_i}, t_i)\, x_{t_i}\,\delta W_{t_i}. \qquad (1.120)$$


⁶ When the drift and volatility (or diffusion) terms in the SDE are written in the form given by equation (1.119)
it is common to refer to $\mu$ and $\sigma$ as the lognormal drift and volatility, respectively. The reason for using this
terminology stems from the fact that in the special case that $\mu$ and $\sigma$ are at most only functions of time $t$ (i.e., not
dependent on $x_t$), the SDE leads to geometric Brownian motion, and, in particular, the conditional transition density
is exactly given by a lognormal distribution, as discussed in the next section.
From this discrete form of equation (1.119) we observe that $x_{t+\delta t} - x_t \equiv \delta x_t = \mu(x_t, t)\,x_t\,\delta t +
\sigma(x_t, t)\,x_t\,\delta W_t$. Alternatively, the solution to equation (1.119) can be characterized as the
process $x_t$ such that

$$\mu(x_t, t) = \lim_{\delta t \to 0} \frac{E_t[x_{t+\delta t} - x_t]}{x_t\,\delta t}, \qquad \sigma(x_t, t)^2 = \lim_{\delta t \to 0} \frac{E_t[(x_{t+\delta t} - x_t)^2]}{x_t^2\,\delta t}. \tag{1.121}$$


These expectations follow from the properties $E_t[\delta W_t] = 0$ and $E_t[(\delta W_t)^2] = \delta t$. Notice
that, although an SDE defines a stochastic process in a fairly constructive way, conditional
distribution probabilities, such as the one for the Wiener process in equation (1.86), can be
computed in analytically closed form only in some particular cases. Advanced methods for
obtaining closed-form conditional (transition) probability densities for certain families of drift
and volatility functions are discussed in Chapter 3, where the corresponding Kolmogorov
(or Fokker-Planck) partial differential equation approach is presented in detail.
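The recurrence (1.120) translates directly into a simulation loop. The following is a minimal sketch, assuming a uniform time step and user-supplied callables `mu` and `sigma` for the lognormal drift and volatility; the function name `euler_path` and the parameter values are illustrative choices, not part of the text.

```python
import math
import random

def euler_path(mu, sigma, x0, T, n_steps, rng):
    """Simulate one path of dx = mu(x,t) x dt + sigma(x,t) x dW via recurrence (1.120)."""
    dt = T / n_steps
    x, t = x0, 0.0
    path = [x0]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment: mean 0, variance dt
        x = x + mu(x, t) * x * dt + sigma(x, t) * x * dW
        t += dt
        path.append(x)
    return path

# Illustrative run with constant lognormal drift and volatility (assumed values).
rng = random.Random(0)
path = euler_path(lambda x, t: 0.05, lambda x, t: 0.2, 100.0, 1.0, 250, rng)
print(len(path), path[-1])
```

With the volatility set to zero the loop reduces to deterministic compounding, which gives a simple sanity check on the drift term.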

A method for constructing stochastic processes is by means of nonlinear transformations. 
The stochastic differential equation satisfied by a nonlinear transformation as a function of 
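another diffusion process is given by Itô’s lemma:

Lemma 1.3. Itô’s Lemma. If the function $f_t = f(x_t, t)$ is smooth with continuous derivatives
$\partial f/\partial t$, $\partial f/\partial x$, and $\partial^2 f/\partial x^2$ and $x_t$ satisfies the stochastic differential

$$dx_t = a(x_t, t)\,dt + b(x_t, t)\,dW_t, \tag{1.122}$$

where $a(x, t)$ and $b(x, t)$ are smooth functions of $x$ and $t$, then the stochastic differential of
$f_t$ is given by

$$df_t = \left(\frac{\partial f}{\partial t} + a(x_t, t)\frac{\partial f}{\partial x} + \frac{b(x_t, t)^2}{2}\frac{\partial^2 f}{\partial x^2}\right)dt + b(x_t, t)\frac{\partial f}{\partial x}\,dW_t \equiv A(x_t, t)\,dt + B(x_t, t)\,dW_t. \tag{1.123}$$

In stochastic integral form:

$$f_t = f_0 + \int_0^t A(x_s, s)\,ds + \int_0^t B(x_s, s)\,dW_s. \tag{1.124}$$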
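A nonrigorous, yet instructive, “proof” is as follows.⁷

Proof. Using a Taylor expansion we find

$$\delta f_t = f(x_t + \delta x_t, t + \delta t) - f(x_t, t) = \frac{\partial f}{\partial t}(x_t, t)\,\delta t + \frac{\partial f}{\partial x}(x_t, t)\,\delta x_t + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}(x_t, t)\,(\delta x_t)^2 + O\big((\delta t)^{3/2}\big), \tag{1.125}$$

where the remainder has an expectation and variance converging to zero as fast as $(\delta t)^2$ in the
limit $\delta t \to 0$. Inserting the finite differential form of equation (1.122) into equation (1.125)
while replacing $(\delta W_t)^2 \to \delta t$ and retaining only terms up to $O(\delta t)$ gives

$$\delta f_t = \left(\frac{\partial f}{\partial t} + a(x_t, t)\frac{\partial f}{\partial x} + \frac{b(x_t, t)^2}{2}\frac{\partial^2 f}{\partial x^2}\right)\delta t + b(x_t, t)\frac{\partial f}{\partial x}\,\delta W_t + O\big((\delta t)^{3/2}\big). \tag{1.126}$$

⁷ For more formal rigorous treatments and proofs see, for example, [IW89, Øks00, JS87].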




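Taking the limit $N \to \infty$ ($\delta t \to 0$), the finite difference $\delta t$ is the infinitesimal differential $dt$,
$\delta W_t$ is the stochastic differential $dW_t$, the remainder term drops out, and we finally obtain
equation (1.123). Alternatively, with the use of equation (1.125) we can obtain the drift
function of the $f_t$ process:

$$\begin{aligned}
A(x_t, t) &= \lim_{\delta t \to 0} \frac{E_t[\delta f_t]}{\delta t}
= \frac{\partial f}{\partial t} + \frac{\partial f}{\partial x}\lim_{\delta t \to 0}\frac{E_t[\delta x_t]}{\delta t} + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\lim_{\delta t \to 0}\frac{E_t[(\delta x_t)^2]}{\delta t} \\
&= \frac{\partial f}{\partial t}(x_t, t) + a(x_t, t)\frac{\partial f}{\partial x}(x_t, t) + \frac{b(x_t, t)^2}{2}\frac{\partial^2 f}{\partial x^2}(x_t, t),
\end{aligned}$$

and the volatility function of the $f_t$ process:

$$B(x_t, t)^2 = \lim_{\delta t \to 0} \frac{E_t[(\delta f_t)^2]}{\delta t} = \left(\frac{\partial f}{\partial x}\right)^2 \lim_{\delta t \to 0}\frac{E_t[(\delta x_t)^2]}{\delta t} = \left(b(x_t, t)\frac{\partial f}{\partial x}\right)^2.$$

The drift and volatility functions therefore give equation (1.123), as required. Here we have
made use of the expectations

$$a(x_t, t) = \lim_{\delta t \to 0} \frac{E_t[\delta x_t]}{\delta t}, \qquad b(x_t, t)^2 = \lim_{\delta t \to 0} \frac{E_t[(\delta x_t)^2]}{\delta t},$$

following from the finite differential form of equation (1.122). □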


Note: Itô’s formula is rather simple to remember if one just takes the Taylor expansion of
the infinitesimal change $df$ up to second order in $dx$ and up to first order in the time increment
$dt$ and then inserts the stochastic expression for $dx$ and replaces $(dx)^2$ by $b(x, t)^2\,dt$.

As we will later see, in most pricing applications, $x_t$ represents some asset price process,
and therefore it proves convenient to consider Itô’s lemma applied to the SDE of
equation (1.119); i.e., $a(x, t) = x\mu(x, t)$, $b(x, t) = x\sigma(x, t)$, written in terms involving the
lognormal drift and volatility functions for the random variable $x$. Equation (1.123) then gives

$$df_t = \left(\frac{\partial f}{\partial t} + x\mu\frac{\partial f}{\partial x} + \frac{1}{2}x^2\sigma^2\frac{\partial^2 f}{\partial x^2}\right)dt + x\sigma\frac{\partial f}{\partial x}\,dW_t \tag{1.127}$$

$$\equiv \mu_f f_t\,dt + \sigma_f f_t\,dW_t. \tag{1.128}$$

From this form of the SDE we identify the corresponding lognormal drift $\mu_f = \mu_f(x, t)$ and
volatility $\sigma_f = \sigma_f(x, t)$ for the process $f_t$.
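As a numerical illustration of equations (1.127) and (1.128), the sketch below approximates the partial derivatives of a given $f(x, t)$ by central finite differences and returns the implied lognormal drift $\mu_f$ and volatility $\sigma_f$. The helper name `ito_lognormal_coeffs`, the step size `h`, and the test function $f = x^2$ are assumptions made for this example only.

```python
def ito_lognormal_coeffs(f, x, t, mu, sigma, h=1e-4):
    """Return (mu_f, sigma_f) from (1.127)-(1.128) using central differences."""
    f0 = f(x, t)
    f_t = (f(x, t + h) - f(x, t - h)) / (2 * h)
    f_x = (f(x + h, t) - f(x - h, t)) / (2 * h)
    f_xx = (f(x + h, t) - 2 * f0 + f(x - h, t)) / (h * h)
    drift = f_t + x * mu * f_x + 0.5 * x**2 * sigma**2 * f_xx  # coefficient of dt
    vol = x * sigma * f_x                                       # coefficient of dW
    return drift / f0, vol / f0

# For f = x^2 the lemma gives mu_f = 2*mu + sigma^2 and sigma_f = 2*sigma.
mu_f, sigma_f = ito_lognormal_coeffs(lambda x, t: x * x, 2.0, 0.0, 0.05, 0.2)
print(mu_f, sigma_f)
```

The division by `f0` assumes $f \neq 0$ at the evaluation point, which is what the lognormal form (1.128) implicitly requires.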

The foregoing derivation of Itô’s lemma for one underlying random variable can be
extended to the general case of a function $f(x_1, \ldots, x_n, t)$ depending on $n$ random variables
$\mathbf{x} = (x_1, \ldots, x_n)$ and time $t$. [Note: To simplify notation, we shall avoid the use of subscript
$t$ in the variables, i.e., $x_{1,t} \equiv x_1$, etc.] We can readily derive Itô’s formula by assuming that
the $x_i$, $i = 1, \ldots, n$, satisfy the stochastic differential equations

$$dx_i = a_i\,dt + b_i \sum_{j=1}^{n} \Lambda_{ij}\,dW_t^j. \tag{1.129}$$
Here the coefficients $a_i = a_i(x_1, \ldots, x_n, t)$ and $b_i = b_i(x_1, \ldots, x_n, t)$ are any smooth functions
of the arguments. Furthermore we assume that the Wiener processes $W_t^i$ are mutually
independent, i.e.,

$$E[dW_t^i\,dW_t^j] = \delta_{ij}\,dt. \tag{1.130}$$

The constants $\rho_{ij} = \rho_{ji}$ (with $\rho_{ii} = 1$) are correlation matrix elements and are convenient for
introducing correlations among the increments (e.g., see equation (1.176) of Section 1.6):

$$E[(dx_i)(dx_j)] = b_i b_j \sum_{k=1}^{n}\sum_{l=1}^{n} \Lambda_{ik}\Lambda_{jl}\,E[dW_t^k\,dW_t^l] = b_i b_j \sum_{k=1}^{n} \Lambda_{ik}\Lambda_{jk}\,dt = b_i b_j \rho_{ij}\,dt. \tag{1.131}$$


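When $i = j$ this gives $E[(dx_i)^2] = b_i^2\,dt$. Taylor expanding $df$ up to second order in the $dx_i$
increments and to first order in $dt$ we have

$$df = \frac{\partial f}{\partial t}\,dt + \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,dx_i + \frac{1}{2}\sum_{i,j=1}^{n} \frac{\partial^2 f}{\partial x_i\,\partial x_j}\,(dx_i)(dx_j). \tag{1.132}$$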








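Now replacing $(dx_i)(dx_j)$ by the right-hand side of equation (1.131) while substituting the
above expression for $dx_i$ and collecting terms in $dt$ and the $dW_t^j$ gives the final expression:

$$df = \left(\frac{\partial f}{\partial t} + \sum_{i=1}^{n}\left[a_i\frac{\partial f}{\partial x_i} + \frac{b_i^2}{2}\frac{\partial^2 f}{\partial x_i^2}\right] + \sum_{i<j=1}^{n} \rho_{ij}\,b_i b_j\,\frac{\partial^2 f}{\partial x_i\,\partial x_j}\right)dt + \sum_{j=1}^{n}\left(\sum_{i=1}^{n} \Lambda_{ij}\,b_i\,\frac{\partial f}{\partial x_i}\right)dW_t^j. \tag{1.133}$$

This procedure can be straightforwardly applied or extended to stochastic differentials of
various processes that are dependent on groups of underlying random variables.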

As we shall see in the coming sections, where we cover derivatives pricing in continuous
time, it is important to work out the stochastic differential of the quotient of two processes,
namely, $f_t = g_t/h_t$, where

$$\frac{dg_t}{g_t} = \mu_g\,dt + \sum_{i=1}^{n} \sigma_g^i\,dW_t^i, \qquad \frac{dh_t}{h_t} = \mu_h\,dt + \sum_{i=1}^{n} \sigma_h^i\,dW_t^i \tag{1.134}$$

are stochastic differential equations assumed satisfied by $g_t$ and $h_t$, respectively. Note that the
drift and volatility functions⁸ are generally considered functions of time and of the underlying
processes, $\mu_g = \mu_g(g_t, h_t, t)$, $\mu_h = \mu_h(g_t, h_t, t)$, $\sigma_g^i = \sigma_g^i(g_t, h_t, t)$, $\sigma_h^i = \sigma_h^i(g_t, h_t, t)$. The
function $\sigma_g^i$ is the volatility of the process $g_t$ with respect to the $i$th independent Wiener
process (or $i$th risk factor).⁹ The stochastic differential of the ratio $f_t = g_t/h_t$ can be obtained
via the Taylor expansion of the differential $df$ up to first order in $dt$ and up to second order


⁸ Here and throughout the rest of the book we shall sometimes take the liberty to refer to the lognormal drift and
volatility functions simply as the drift and volatility so as to avoid excessive use of such terminology.
⁹ In what follows we shall at times also refer to independent Brownian motions as risk factors.
in the $dg$ and $dh$ terms. Hence considering $f$ as a function of $g$, $h$, and $t$ and taking appropriate
partial derivatives gives

$$df = \frac{1}{h}\,dg - \frac{g}{h^2}\,dh - \frac{1}{h^2}\,(dg)(dh) + \frac{g}{h^3}\,(dh)^2. \tag{1.135}$$

Here $\partial f/\partial t = 0$, since there is no explicit time dependence. Moreover, since $\partial^2 f/\partial g^2 = 0$, the $(dg)^2$
term is absent. This last SDE takes on a particularly simple form when we divide through by $f$:

$$\frac{df}{f} = \frac{dg}{g} - \frac{dh}{h} - \frac{dg}{g}\frac{dh}{h} + \left(\frac{dh}{h}\right)^2 = \left(\frac{dg}{g} - \frac{dh}{h}\right)\left(1 - \frac{dh}{h}\right). \tag{1.136}$$


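Substituting equations (1.134), expanding out, and setting to zero any term containing
$(dW_t^i)(dt)$ or $(dt)^2$ [i.e., terms of $O((dt)^{3/2})$ or higher] then gives

$$\frac{df_t}{f_t} = \left[\mu_g - \mu_h - \sum_{i=1}^{n} \sigma_h^i\,(\sigma_g^i - \sigma_h^i)\right]dt + \sum_{i=1}^{n} (\sigma_g^i - \sigma_h^i)\,dW_t^i. \tag{1.137}$$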
Here we have also made use of the replacement $dW_t^i\,dW_t^j = \delta_{ij}\,dt$. This gives the stochastic
differential of $f_t = g_t/h_t$. Note that this equation in compact form reads

$$\frac{df_t}{f_t} = \mu_f\,dt + \sum_{i=1}^{n} \sigma_f^i\,dW_t^i, \tag{1.138}$$

where the drift of $f$ is $\mu_f = \mu_g - \mu_h - \sum_{i=1}^{n} \sigma_h^i(\sigma_g^i - \sigma_h^i)$ and the volatility is given
by $\sigma_f^i = \sigma_g^i - \sigma_h^i$. It is important to note that pricing formulas ultimately involve the
absolute value or square of the volatilities, i.e., $(\sigma_f^i)^2 = |\sigma_g^i - \sigma_h^i|^2 = (\sigma_g^i)^2 + (\sigma_h^i)^2 - 2\sigma_g^i\sigma_h^i$.
This will become clear in the sections that follow. Namely, a rigorous justification of
this arises from consideration of the partial differential equation (i.e., the forward or
backward Kolmogorov equation) satisfied by the corresponding transition probability
density function, which explicitly involves only terms in the square of the volatilities.
Finally, note that for the case of only one risk factor, i.e., $n = 1$, we have equation (1.138)
with $\mu_f = \mu_g - \mu_h - \sigma_h(\sigma_g - \sigma_h)$ and $\sigma_f = \sigma_g - \sigma_h$. For general $n$, using vector notation
($\boldsymbol{\sigma}_f = \boldsymbol{\sigma}_g - \boldsymbol{\sigma}_h$, $\mu_f = \mu_g - \mu_h - \boldsymbol{\sigma}_h \cdot (\boldsymbol{\sigma}_g - \boldsymbol{\sigma}_h)$), equation (1.138) takes the form:

$$\frac{df_t}{f_t} = \mu_f\,dt + \boldsymbol{\sigma}_f \cdot d\mathbf{W}_t. \tag{1.139}$$
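The quotient rule (1.139) is easy to encode. The sketch below computes $\mu_f$ and $\boldsymbol{\sigma}_f$ from the coefficients of $g$ and $h$ and cross-checks the drift in the special case of constant coefficients, where $g_T/h_T$ is lognormal and its expected growth rate is available from the lognormal mean. The function names and numerical values are illustrative assumptions.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def quotient_coeffs(mu_g, mu_h, sigma_g, sigma_h):
    """Lognormal drift and volatility vector of f = g/h, per equation (1.139)."""
    sigma_f = [sg - sh for sg, sh in zip(sigma_g, sigma_h)]
    mu_f = mu_g - mu_h - dot(sigma_h, sigma_f)
    return mu_f, sigma_f

mu_g, mu_h = 0.08, 0.03
sigma_g, sigma_h = [0.20, 0.10], [0.05, 0.15]
mu_f, sigma_f = quotient_coeffs(mu_g, mu_h, sigma_g, sigma_h)

# Constant-coefficient cross-check: log(g_T/h_T) is normal with mean
# (mu_g - |sigma_g|^2/2 - mu_h + |sigma_h|^2/2) T and variance |sigma_f|^2 T,
# so E[g_T/h_T] grows at the rate below, which should coincide with mu_f.
growth = (mu_g - 0.5 * dot(sigma_g, sigma_g)) - (mu_h - 0.5 * dot(sigma_h, sigma_h)) \
    + 0.5 * dot(sigma_f, sigma_f)
print(mu_f, growth, sigma_f)
```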

Recall that a martingale process, which we shall here simply denote by $f_t$, is a stochastic
process for which $E_t^P[f_T] = f_t$, $t < T$, under a given probability measure $P$. Recall that this
is a driftless process, in the sense that its expected value, under $P$, is constant over all future
times. We have already encountered a simple example of such a process, namely, the standard
Brownian motion, or Wiener process $W_t$. Equation (1.90) provides a method of generating a
martingale process. Based on Itô’s Lemma we now have the following result.

Theorem. (Feynman-Kac) If $f(x, t)$ is the function given by the conditional expectation

$$f(x, t) = E_t[\phi(x_T)], \tag{1.140}$$

at time $t < T$, with $x_t = x$ and underlying process obeying equation (1.122), then $f(x, t)$ satisfies
the partial differential equation

$$\frac{\partial f(x, t)}{\partial t} + a(x, t)\frac{\partial f(x, t)}{\partial x} + \frac{b(x, t)^2}{2}\frac{\partial^2 f(x, t)}{\partial x^2} = 0, \tag{1.141}$$

with terminal time condition $f(x, T) = \phi(x)$.
Proof. The proof follows by considering the conditional expectation of equation (1.126) at
time $t$, which leaves us with only the drift term in $\delta t$ (to order $\delta t$), since the Wiener term is
Markovian, with $E_t[\delta W_t] = 0$. On the other hand,

$$E_t[\delta f_t] = E_t[f_{t+\delta t}] - f_t = 0. \tag{1.142}$$

The last equality is due to the martingale property of $f_t$. In the limit of infinitesimal time step
we are left with the infinitesimal drift term, which vanishes identically only if equation (1.141)
is satisfied. The terminal condition follows simply because $f(x, t = T) = E_T[\phi(x_T)] = \phi(x)$,
with $x_T = x$ imposed when $t = T$. □


The Black-Scholes partial differential equation discussed in Section 1.13 is a special case
of the Feynman-Kac result. The generalization of equation (1.141) to $n$ dimensions is also
readily obtained by using Itô’s lemma in $n$ dimensions.
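A quick way to see the Feynman-Kac result at work is to take a case where the conditional expectation is known in closed form and check that the PDE residual vanishes. The sketch below assumes constant coefficients $a$ and $b$ in equation (1.122) and the payoff $\phi(x) = x^2$, for which $E_t[x_T^2] = (x + a(T-t))^2 + b^2(T-t)$; the function names and step size are hypothetical choices for this illustration.

```python
def expected_square(x, t, a, b, T):
    """f(x,t) = E_t[x_T^2] when dx = a dt + b dW with constant a and b."""
    tau = T - t
    return (x + a * tau) ** 2 + b * b * tau

def pde_residual(f, x, t, a, b, h=1e-3):
    """Left-hand side of (1.141), with derivatives taken by central differences."""
    f_t = (f(x, t + h) - f(x, t - h)) / (2 * h)
    f_x = (f(x + h, t) - f(x - h, t)) / (2 * h)
    f_xx = (f(x + h, t) - 2 * f(x, t) + f(x - h, t)) / (h * h)
    return f_t + a * f_x + 0.5 * b * b * f_xx

a, b, T = 0.1, 0.3, 2.0
f = lambda x, t: expected_square(x, t, a, b, T)
print(pde_residual(f, 1.5, 0.5, a, b))  # should be close to zero
print(f(1.5, T))                        # terminal condition: phi(1.5) = 1.5**2
```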


Problems 


Problem 1. Consider the stochastic processes $g_t$ and $h_t$ defined earlier. Further assume that
the volatilities of the two processes are identical with respect to all Brownian increments,
i.e., $\sigma_g^i = \sigma_h^i$ for all $i$. Show that the process $f_t = g_t/h_t$ is deterministic with solution

$$f_t = f_0 \exp\left(\int_0^t \big(\mu_g(s) - \mu_h(s)\big)\,ds\right). \tag{1.143}$$

Problem 2. Consider two processes defined by $g_t = g_0 e^{\mu_g t + \sigma_g W_t}$ and $h_t = h_0 e^{\mu_h t + \sigma_h W_t}$, where
$W_t$ is a standard Wiener process and $\mu_g$, $\mu_h$, $\sigma_g$, $\sigma_h$, $g_0$, and $h_0$ are constants. Use Itô’s lemma
to show that

$$\frac{dg_t}{g_t} = \left(\mu_g + \frac{\sigma_g^2}{2}\right)dt + \sigma_g\,dW_t, \qquad \frac{dh_t}{h_t} = \left(\mu_h + \frac{\sigma_h^2}{2}\right)dt + \sigma_h\,dW_t. \tag{1.144}$$

Then assume $df_t/f_t = \mu_f\,dt + \sigma_f\,dW_t$. Find these drift and volatility coefficients in terms of
$\mu_g$, $\mu_h$, $\sigma_g$, and $\sigma_h$, for the cases $f_t = g_t/h_t$ and $f_t = g_t h_t$.

Problem 3. Obtain the stochastic differential equations satisfied by the Ornstein-Uhlenbeck
and Brownian bridge processes in Problem 8 of Section 1.4.


1.6 Geometric Brownian Motion 


Univariate geometric Brownian motion with time-dependent coefficients is characterized by
the SDE of the form

$$dS_t = \mu(t)\,S_t\,dt + \sigma(t)\,S_t\,dW_t, \tag{1.145}$$

with initial condition $S_0$, where $\mu = \mu(t)$ and $\sigma = \sigma(t)$ are deterministic functions of time $t$.
This equation can be solved by means of the change of variable

$$x_t = \log\frac{S_t}{S_0}. \tag{1.146}$$
The transformed equation is obtained using Itô’s lemma,

$$dx_t = \left(\mu(t) - \frac{\sigma(t)^2}{2}\right)dt + \sigma(t)\,dW_t, \tag{1.147}$$

and is to be solved with initial condition $x_0 = 0$. Following the procedure in Section 1.4 we
discretize this equation in the time interval $[0, T]$ using a partition in $N$ subintervals of length
$\delta t = T/N$:

$$x_{t_{i+1}} = x_{t_i} + \left(\mu(t_i) - \frac{\sigma(t_i)^2}{2}\right)\delta t + \sigma(t_i)\,\delta W_{t_i}. \tag{1.148}$$

By iterating the recurrence relations up to time $T$, we find

$$x_T = \sum_{i=0}^{N-1}\left[\left(\mu(t_i) - \frac{\sigma(t_i)^2}{2}\right)\delta t + \sigma(t_i)\,\delta W_{t_i}\right]. \tag{1.149}$$


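Hence $x_T$ is a normal random variable for all $N \ge 1$. In the limit as $N \to \infty$, the mean of $x_T$
is given by

$$E_0[x_T] = \lim_{N \to \infty} \sum_{i=0}^{N-1}\left(\mu(t_i) - \frac{\sigma(t_i)^2}{2}\right)\delta t = \int_0^T \left(\mu(t) - \frac{\sigma(t)^2}{2}\right)dt \tag{1.150}$$

and the variance is given by

$$E_0[x_T^2] - (E_0[x_T])^2 = \lim_{N \to \infty} \sum_{i=0}^{N-1} \sigma(t_i)^2\,\delta t = \int_0^T \sigma(t)^2\,dt. \tag{1.151}$$

Introducing the time-averaged drift and volatility

$$\bar{\mu}(T) = \frac{1}{T}\int_0^T \mu(t)\,dt \tag{1.152}$$

and

$$\bar{\sigma}(T)^2 = \frac{1}{T}\int_0^T \sigma(t)^2\,dt, \tag{1.153}$$

we conclude that $x_T = \log\frac{S_T}{S_0} \sim N\!\left(\big(\bar{\mu}(T) - \tfrac{1}{2}\bar{\sigma}(T)^2\big)T,\ \bar{\sigma}(T)^2\,T\right)$. This result is also easily
verified by directly applying properties (1.105) and (1.106) to the integrated form of equation (1.147).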
The solution to stochastic differential equation (1.145) for all $t \ge 0$ is hence

$$S_t = S_0 \exp\left(\Big(\bar{\mu}(t) - \frac{\bar{\sigma}(t)^2}{2}\Big)t + \bar{\sigma}(t)\,W_t\right), \tag{1.154}$$

where $\bar{\mu}(t)$ and $\bar{\sigma}(t)$ are given by equations (1.152) and (1.153), respectively. This solution
(which is actually a strong solution) can also be verified by a direct application of Itô’s lemma
(see Problem 1). Note that this represents a solution, in the sense that the random variable
denoted by $S_t$ and parameterized by time $t$ is expressed in terms of the underlying random
variable, $W_t$, for the pure Wiener process.
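Equation (1.154) can be evaluated numerically once the time averages (1.152) and (1.153) are available. The sketch below approximates both averages with a midpoint rule and returns $S_t$ for a given value of $W_t$; the helper names, the number of quadrature points, and the sample drift and volatility functions are assumptions for illustration.

```python
import math

def time_average(func, t, n=2000):
    """Midpoint-rule approximation of (1/t) * integral of func over [0, t]."""
    dt = t / n
    return sum(func((i + 0.5) * dt) for i in range(n)) * dt / t

def gbm_value(S0, mu, sigma, t, w_t):
    """S_t from equation (1.154), given the Wiener value w_t at time t."""
    mu_bar = time_average(mu, t)                         # equation (1.152)
    var_bar = time_average(lambda s: sigma(s) ** 2, t)   # sigma_bar^2, equation (1.153)
    return S0 * math.exp((mu_bar - 0.5 * var_bar) * t + math.sqrt(var_bar) * w_t)

# Illustrative time-dependent drift and constant volatility (assumed values).
mu = lambda s: 0.02 + 0.04 * s
sigma = lambda s: 0.2
print(gbm_value(100.0, mu, sigma, 1.0, 0.0))
```

To draw a random sample of $S_t$ at a fixed time, one would pass `w_t = math.sqrt(t) * x` with `x` a standard normal draw.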
This solution gives a closed-form expression for generating sample paths for geometric
Brownian motion. Equation (1.154) provides a general expression for the case of time-dependent
drift and volatility. It is very instructive at this point to compute expectations of
functions of $S_t$. Let us consider the process in equation (1.145) and proceed now to compute the
expectations $E_0[S_t]$ and $E_0[(S_t - K)_+]$, for some constant $K > 0$, where $(x)_+ \equiv \max(x, 0) = x$
if $x > 0$ and zero if $x \le 0$. Using the solution in equation (1.154), the expectation of $S_t$ under
the density of equation (1.95) (i.e., conditional on $S_{t=0} = S_0$, hence we write $E_0[\,\cdot\,]$) is

$$E_0[S_t] = S_0\,e^{(\bar{\mu} - \bar{\sigma}^2/2)t}\,E_0\big[e^{\bar{\sigma}W_t}\big] = S_0\,e^{(\bar{\mu} - \bar{\sigma}^2/2)t}\,e^{\bar{\sigma}^2 t/2} = S_0\,e^{\bar{\mu}t}. \tag{1.155}$$

To compact notation we denote $\bar{\mu} \equiv \bar{\mu}(t)$, $\bar{\sigma} \equiv \bar{\sigma}(t)$. In the last step we have used an important
identity derived in Problem 2 of this section. This result shows that the stock price is expected
to grow exponentially at a rate of $\bar{\mu}$.
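Using equation (1.154), the expectation $E_0[(S_t - K)_+]$ is given by

$$E_0[(S_t - K)_+] = \int_{-\infty}^{\infty} p(y, t)\left(S_0\,e^{(\bar{\mu} - \bar{\sigma}^2/2)t + \bar{\sigma}y} - K\right)_+ dy = \frac{S_0\,e^{(\bar{\mu} - \bar{\sigma}^2/2)t}}{\sqrt{2\pi t}} \int_{-\infty}^{\infty} e^{-y^2/2t}\left(e^{\bar{\sigma}y} - \frac{K}{S_0}\,e^{-(\bar{\mu} - \bar{\sigma}^2/2)t}\right)_+ dy. \tag{1.156}$$

The last step obtains from the identity $(ax - b)_+ = a(x - b/a)_+$ for $a > 0$. Changing
integration variable $y = \sqrt{t}\,x$ while employing this identity again gives

$$E_0[(S_t - K)_+] = \frac{S_0\,e^{(\bar{\mu} - \bar{\sigma}^2/2)t}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/2 + \bar{\sigma}\sqrt{t}\,x}\left(1 - \frac{K}{S_0}\,e^{-(\bar{\mu} - \bar{\sigma}^2/2)t - \bar{\sigma}\sqrt{t}\,x}\right)_+ dx. \tag{1.157}$$

Since $e^{-\bar{\sigma}\sqrt{t}\,x}$ is a monotonically decreasing function of $x$, there is a value $x_K$ such that

$$\left(1 - \frac{K}{S_0}\,e^{-(\bar{\mu} - \bar{\sigma}^2/2)t - \bar{\sigma}\sqrt{t}\,x}\right)_+ = \begin{cases} 1 - \dfrac{K}{S_0}\,e^{-(\bar{\mu} - \bar{\sigma}^2/2)t - \bar{\sigma}\sqrt{t}\,x}, & x > x_K \\[2mm] 0, & x \le x_K \end{cases} \tag{1.158}$$

where

$$x_K = -\frac{\log(S_0/K) + (\bar{\mu} - \bar{\sigma}^2/2)t}{\bar{\sigma}\sqrt{t}}. \tag{1.159}$$

Hence, the integral in equation (1.157) becomes a sum of two parts in the region $x \in (x_K, \infty)$:

$$E_0[(S_t - K)_+] = \frac{S_0\,e^{(\bar{\mu} - \bar{\sigma}^2/2)t}}{\sqrt{2\pi}} \int_{x_K}^{\infty} e^{-x^2/2 + \bar{\sigma}\sqrt{t}\,x}\,dx - \frac{K}{\sqrt{2\pi}} \int_{x_K}^{\infty} e^{-x^2/2}\,dx. \tag{1.160}$$

Completing the square in the first integration gives

$$\begin{aligned}
E_0[(S_t - K)_+] &= S_0\,e^{\bar{\mu}t}\left[1 - N(x_K - \bar{\sigma}\sqrt{t})\right] - K\,N(-x_K) \\
&= S_0\,e^{\bar{\mu}t}\,N(\bar{\sigma}\sqrt{t} - x_K) - K\,N(-x_K) \\
&= S_0\,e^{\bar{\mu}t}\,N(d_+) - K\,N(d_-),
\end{aligned} \tag{1.161}$$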
where $N(\cdot)$ is the standard cumulative normal distribution function and

$$d_\pm = \frac{\log(S_0/K) + (\bar{\mu} \pm \bar{\sigma}^2/2)t}{\bar{\sigma}\sqrt{t}}. \tag{1.162}$$

Note that here we have used the property $N(-x) = 1 - N(x)$.

The Black-Scholes pricing formula for a plain European call option follows automatically.
In particular, assuming a risk-neutral pricing measure, the drift is given by the instantaneous
risk-free rate $\mu(t) = r(t)$. Hence, the price of a call at current time ($t = 0$) with current stock
level (or spot) $S_0$, strike $K$, and maturing in time $t$ is given by the discounted expectation

$$C_0(S_0, K, t) = e^{-\bar{r}t}\,E_0[(S_t - K)_+] = S_0\,N(d_+) - e^{-\bar{r}t}\,K\,N(d_-), \tag{1.163}$$

where $\bar{r}$ is the time-averaged continuously compounded risk-free interest rate

$$\bar{r} \equiv \bar{r}(t) = \frac{1}{t}\int_0^t r(\tau)\,d\tau, \tag{1.164}$$

and $d_\pm$ is given by equation (1.162) with $\bar{\mu} = \bar{r}$. It is instructive to note the inherent difference
between the Black-Scholes pricing formula in equation (1.163) and Bachelier’s formula in
equation (1.117). Bachelier’s formula is a result of assuming a standard Brownian motion
for the underlying stock price process [i.e., equation (1.94)]. In contrast, formulas of the
Black-Scholes type are equivalent to the assumption of geometric Brownian motion for the
underlying price process. Using equation (1.154) as defining a change of probability variables
$W_t \to S_t$, the one-dimensional analogue of equation (1.48) together with equation (1.95) gives

$$p(S_t, S_0; t) = \frac{1}{S_t\,\bar{\sigma}\sqrt{2\pi t}}\,\exp\left(-\frac{\big[\log(S_t/S_0) - (\bar{\mu} - \bar{\sigma}^2/2)t\big]^2}{2\bar{\sigma}^2 t}\right). \tag{1.165}$$

This is the lognormal distribution function defined on positive stock price space $S_t \in (0, \infty)$.
The log-returns $\log(S_t/S_0)$ are distributed normally with mean $(\bar{\mu} - \bar{\sigma}^2/2)t$ and variance $\bar{\sigma}^2 t$.
Setting $\bar{\mu} = \bar{r}$ gives the risk-neutral conditional probability density for a stock attaining a
value $S_t$ at time $t > 0$ given an initial value $S_0$ at time $t = 0$. Hence, the Black-Scholes
pricing formula for European options can also be obtained by taking discounted expectations
of payoff functions with respect to this risk-neutral density. In particular, a European-style
claim having pay-off $\Lambda(S_T)$ as a function of the terminal stock level $S_T$, where $T > 0$ is a
maturity time, has arbitrage-free price $f_0(S_0, T)$ at time $t = 0$ expressible as

$$f_0(S_0, T) = e^{-\bar{r}(T)T}\,E_0^Q[\Lambda(S_T)] = e^{-\bar{r}(T)T} \int_0^{\infty} p(S_T, S_0; T)\,\Lambda(S_T)\,dS_T. \tag{1.166}$$

Here the superscript $Q$ is used to denote an expectation with respect to the risk-neutral density
given by equation (1.165) with drift $\bar{\mu} = \bar{r}(T)$. Note that within this probability measure,
equation (1.155) shows that stock prices drift at the time-averaged risk-free rate $\bar{r}(t)$ at time $t$.
As will become apparent in the following sections, this must be the case in order to ensure
arbitrage-free pricing.
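The call formula (1.163) and the density-based representation (1.166) can be compared numerically. The sketch below assumes a constant rate and volatility (so that $\bar{r} = r$ and $\bar{\sigma} = \sigma$) and evaluates (1.166) after the change of variable from $S_T$ to a standard normal $x$, using a trapezoid rule; the function names, grid size, and integration width are illustrative assumptions.

```python
import math

def norm_cdf(x):
    """Standard cumulative normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(S0, K, r, sigma, t):
    """Equation (1.163) with constant r and sigma."""
    d_plus = (math.log(S0 / K) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d_minus = d_plus - sigma * math.sqrt(t)
    return S0 * norm_cdf(d_plus) - math.exp(-r * t) * K * norm_cdf(d_minus)

def price_by_quadrature(payoff, S0, r, sigma, T, n=20000, width=8.0):
    """Discounted risk-neutral expectation of payoff(S_T), integrating over x ~ N(0, 1)."""
    h = 2 * width / n
    total = 0.0
    for i in range(n + 1):
        x = -width + i * h
        S_T = S0 * math.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * x)
        weight = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * math.exp(-0.5 * x * x) * payoff(S_T)
    return math.exp(-r * T) * total * h / math.sqrt(2 * math.pi)

closed_form = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
numeric = price_by_quadrature(lambda s: max(s - 100.0, 0.0), 100.0, 0.05, 0.2, 1.0)
print(closed_form, numeric)
```

The quadrature routine accepts any payoff function, so the same check applies to other European-style claims.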

For pricing applications, discussed in greater length in later sections of this chapter, it
is useful to consider a slight extension of the foregoing closed-form solutions to geometric
Brownian motion. Namely, we can extend equation (1.154) by a simple shift in time variables
as follows:

$$S_T = S_t \exp\left(\Big(\bar{\mu}(t, T) - \frac{\bar{\sigma}(t, T)^2}{2}\Big)(T - t) + \bar{\sigma}(t, T)\,W_{T-t}\right), \tag{1.167}$$

with time-averaged drift and volatility over the period $[t, T]$

$$\bar{\mu}(t, T) = \frac{1}{T - t}\int_t^T \mu(\tau)\,d\tau, \qquad \bar{\sigma}(t, T)^2 = \frac{1}{T - t}\int_t^T \sigma(\tau)^2\,d\tau. \tag{1.168}$$

Here $W_{T-t} \equiv W_T - W_t$ is the Wiener normal random variable with mean zero and variance $T - t$;
i.e., $W_{T-t} \sim \sqrt{T - t}\,x$, $x \sim N(0, 1)$. For constant drift and volatility this solution simplifies
in the obvious manner. The formula for the conditional expectation now extends to give

$$E_t[(S_T - K)_+] = e^{\bar{\mu}(T-t)}\,S_t\,N(d_+) - K\,N(d_-), \tag{1.169}$$

with

$$d_\pm = \frac{\log(S_t/K) + (\bar{\mu} \pm \bar{\sigma}^2/2)(T - t)}{\bar{\sigma}\sqrt{T - t}} \tag{1.170}$$

and $\bar{\mu} \equiv \bar{\mu}(t, T)$, $\bar{\sigma} \equiv \bar{\sigma}(t, T)$. A related expectation that is useful for pricing purposes is (see
Problem 3)

$$E_t[(K - S_T)_+] = K\,N(-d_-) - e^{\bar{\mu}(T-t)}\,S_t\,N(-d_+). \tag{1.171}$$

Within the risk-neutral probability measure, $\bar{\mu} = \bar{r}$. Hence discounting this expectation by
$e^{-\bar{r}(T-t)}$ gives the analogue of equation (1.163) for the Black-Scholes price of a put option at
calendar time $t$, spot $S_t$, and maturing at time $T$ with strike $K$:

$$P_t(S_t, K, T) = e^{-\bar{r}(T-t)}\,K\,N(-d_-) - S_t\,N(-d_+), \tag{1.172}$$

where $d_\pm$ is given by equation (1.170) with $\bar{\mu} = \bar{r} \equiv \bar{r}(t, T)$.
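The put formula (1.172) and the time-shifted call formula are linked by put-call parity, $C_t - P_t = S_t - e^{-\bar{r}(T-t)}K$, which follows from the identity used in Problem 3. The sketch below, assuming constant $r$ and $\sigma$ and writing `tau` for $T - t$, implements both prices and checks parity; the function names are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard cumulative normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d_plus_minus(S_t, K, mu, sigma, tau):
    """d_+ and d_- of equation (1.170), with tau = T - t."""
    d_plus = (math.log(S_t / K) + (mu + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    return d_plus, d_plus - sigma * math.sqrt(tau)

def bs_call(S_t, K, r, sigma, tau):
    """Discounted version of (1.169) under the risk-neutral drift mu = r."""
    d_p, d_m = d_plus_minus(S_t, K, r, sigma, tau)
    return S_t * norm_cdf(d_p) - math.exp(-r * tau) * K * norm_cdf(d_m)

def bs_put(S_t, K, r, sigma, tau):
    """Equation (1.172)."""
    d_p, d_m = d_plus_minus(S_t, K, r, sigma, tau)
    return math.exp(-r * tau) * K * norm_cdf(-d_m) - S_t * norm_cdf(-d_p)

call = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
put = bs_put(100.0, 100.0, 0.05, 0.2, 1.0)
print(call, put, call - put, 100.0 - math.exp(-0.05) * 100.0)
```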

In closing this section, we consider the more general multidimensional case of geometric
Brownian motion. Multivariate geometric Brownian motions describe $n$-dimensional state
spaces of vector valued processes $S_t^1, \ldots, S_t^n$ and can be described with two different but
equivalent sets of notations. Let’s consider $n$ uncorrelated standard Wiener processes

$$W_t^1, \ldots, W_t^n, \quad \text{with } E_t[dW_t^i\,dW_t^j] = \delta_{ij}\,dt. \tag{1.173}$$

A simple way to introduce correlations among the price processes is to allow for correlated
Wiener processes by defining a new set of $n$ processes $W_t^{\rho,i}$ as

$$dW_t^{\rho,i} = \sum_{j=1}^{n} \Lambda_{ij}\,dW_t^j, \tag{1.174}$$

or, in matrix-vector notation,

$$d\mathbf{W}_t^{\rho} = \Lambda \cdot d\mathbf{W}_t. \tag{1.175}$$

Using equation (1.174) we have

$$E_t\big[dW_t^{\rho,i}\,dW_t^{\rho,j}\big] = \sum_{k,l=1}^{n} \Lambda_{ik}\Lambda_{jl}\,\delta_{kl}\,dt = \sum_{k=1}^{n} \Lambda_{ik}\Lambda_{jk}\,dt = \rho_{ij}\,dt, \tag{1.176}$$

where the last relation defines a correlation matrix $\rho$, with elements $\rho_{ij}$, and lower Cholesky
decomposition given by

$$\rho = \Lambda\Lambda^{\dagger}. \tag{1.177}$$

Throughout this section, superscript $\dagger$ denotes matrix transpose.
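The lower Cholesky factor $\Lambda$ in equation (1.177) can be computed with a few nested loops, and equation (1.174) then turns independent increments into correlated ones. The following is a plain-Python sketch under the assumption that the correlation matrix is positive definite; the function names and the sample matrix are illustrative only.

```python
import math

def cholesky_lower(rho):
    """Return the lower-triangular Lam with Lam * Lam^T = rho (rho positive definite)."""
    n = len(rho)
    lam = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lam[i][k] * lam[j][k] for k in range(j))
            if i == j:
                lam[i][j] = math.sqrt(rho[i][i] - s)
            else:
                lam[i][j] = (rho[i][j] - s) / lam[j][j]
    return lam

def correlate(lam, dW):
    """Equation (1.174): correlated increments from independent increments dW."""
    n = len(dW)
    return [sum(lam[i][j] * dW[j] for j in range(n)) for i in range(n)]

rho = [[1.0, 0.5, 0.2],
       [0.5, 1.0, 0.3],
       [0.2, 0.3, 1.0]]
lam = cholesky_lower(rho)
print(lam)
print(correlate(lam, [0.1, -0.2, 0.05]))
```

For $n = 2$ the factor reduces to the rows $(1, 0)$ and $(\rho, \sqrt{1 - \rho^2})$, the form used for two stocks later in this section.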




Stochastic differential equations for the stock price processes can be written as follows:

$$\frac{dS_t^i}{S_t^i} = \mu_i\,dt + \sigma_i\,dW_t^{\rho,i} \tag{1.178}$$

$$= \mu_i\,dt + \sigma_i \sum_{j=1}^{n} \Lambda_{ij}\,dW_t^j = \mu_i\,dt + \sum_{j=1}^{n} L_{ij}\,dW_t^j, \tag{1.179}$$

where the last expression defines the matrix $L$, $L_{ij} = \sigma_i\Lambda_{ij}$. Note that the lognormal drifts
$\mu_i$ and volatilities $\sigma_i$ can generally depend on time, although to simplify notation we have
chosen not to denote this explicitly. The last relation in equation (1.179) defines a lower
Cholesky factorization of the covariance matrix

$$C = LL^{\dagger} = (\Sigma\Lambda)(\Sigma\Lambda)^{\dagger} = \Sigma\rho\Sigma. \tag{1.180}$$

Here $\Sigma$ is the diagonal matrix of lognormal volatilities with $(ij)$-elements given by $\delta_{ij}\sigma_i$,
$L = \Sigma\Lambda$ and $\Sigma = \Sigma^{\dagger}$. In vector notation we can write equations (1.179) in a compact form as

$$\frac{dS_t^i}{S_t^i} = \mu_i\,dt + \boldsymbol{\sigma}_i \cdot d\mathbf{W}_t, \tag{1.181}$$

where $\boldsymbol{\sigma}_i = (\sigma_{i1}, \ldots, \sigma_{in})$ is the volatility vector for the $i$th stock, whose $j$th component
$\sigma_{ij} = L_{ij}$ gives the lognormal volatility with respect to the $j$th risk factor.
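Equation (1.61) in Section 1.2 gives $L$ for the case $n = 2$. In particular, in the case of two
stocks we can introduce a correlation $\rho$, where equations (1.179) now take the specific form

$$\frac{dS_t^1}{S_t^1} = \mu_1\,dt + \sigma_1\,dW_t^1, \tag{1.182}$$

$$\frac{dS_t^2}{S_t^2} = \mu_2\,dt + \rho\sigma_2\,dW_t^1 + \sqrt{1 - \rho^2}\,\sigma_2\,dW_t^2, \tag{1.183}$$

with infinitesimal variances and covariances

$$E_t\left[\left(\frac{dS_t^1}{S_t^1}\right)^2\right] = \sigma_1^2\,dt, \qquad E_t\left[\left(\frac{dS_t^2}{S_t^2}\right)^2\right] = \sigma_2^2\,dt, \qquad E_t\left[\frac{dS_t^1}{S_t^1}\frac{dS_t^2}{S_t^2}\right] = \rho\sigma_1\sigma_2\,dt. \tag{1.184}$$

For this case the volatility vectors are given by $\boldsymbol{\sigma}_1 = (\sigma_1, 0)$ and $\boldsymbol{\sigma}_2 = (\rho\sigma_2, \sigma_2\sqrt{1 - \rho^2})$ for
stock prices $S_t^1$ and $S_t^2$, respectively.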

More generally, equations (1.179) [or (1.181)] describe geometric Brownian motion for
an arbitrary number of $n$ stocks with infinitesimal correlations and variances:

$$E_t\left[\frac{dS_t^i}{S_t^i}\frac{dS_t^j}{S_t^j}\right] = C_{ij}\,dt, \qquad E_t\left[\left(\frac{dS_t^i}{S_t^i}\right)^2\right] = \sigma_i^2\,dt. \tag{1.185}$$

The vectors $\boldsymbol{\sigma}_i$ are seen to be given by the $i$th rows of matrix $L$, i.e., the matrix of the lower
Cholesky factorization of the covariance matrix.

A solution to the system of stochastic differential equations (1.179) [or (1.181)] is readily 
obtained by employing a simple change-of-variable approach (see Problem 4). In particular, 











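$$S_T^i = S_t^i \exp\left(\Big(\mu_i - \frac{\sigma_i^2}{2}\Big)(T - t) + \sigma_i \sum_{j=1}^{n} \Lambda_{ij}\,W_{T-t}^j\right), \qquad i = 1, \ldots, n, \tag{1.186}$$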
where we denote $W_{T-t}^j \equiv W_T^j - W_t^j$, for each $j$th independent Wiener normal random variable
with mean zero and variance $T - t$; i.e., $W_{T-t}^j = \sqrt{T - t}\,x_j$, $x_j \sim N(0, 1)$ independently for
all $j = 1, \ldots, n$. From this result one readily obtains the multivariate lognormal distribution
function $p(\mathbf{S}_T, \mathbf{S}_t; T - t)$, i.e., the analogue of equation (1.165) [see equation (1.198) in
Problem 5]. The pricing of European-style options whose pay-offs depend on a group of $n$
stocks, i.e., European basket options, can then proceed by computing expectations of such
pay-offs over this density, where the drifts are set by risk neutrality. That is, let’s assume
a money-market account $B_t = e^{rt}$ with constant risk-free rate $r$; then within the risk-neutral
measure the stock prices must all drift at the same rate, giving $\mu_i = r$.¹⁰ Let $V_t$ denote
the option price at time $t$ for a European-style contract with payoff function at maturity time
$T$ given by $V_T = \Pi(\mathbf{S}_T)$, $\mathbf{S}_T = (S_T^1, \ldots, S_T^n)$. The arbitrage-free price is then given by the
expectation

$$\begin{aligned}
V_t &= e^{-r(T-t)}\,E_t^{Q(B)}[\Pi(\mathbf{S}_T)] \\
&= e^{-r(T-t)} \int_{\mathbb{R}_+^n} p(\mathbf{S}_T, \mathbf{S}_t; T - t)\,\Pi(\mathbf{S}_T)\,d\mathbf{S}_T \\
&= \frac{e^{-r(T-t)}}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} e^{-\frac{1}{2}|\mathbf{x}|^2}\,\Pi(\mathbf{S}_T(\mathbf{x}))\,d\mathbf{x},
\end{aligned} \tag{1.187}$$
where $\mathbf{S}_T(\mathbf{x})$ has components $S_T^i(\mathbf{x})$ given by equation (1.186), $\mathbf{x} = (x_1, \ldots, x_n)$. The price
hence involves an $n$-dimensional integral over a multivariate normal times some payoff
function. Exact analytical expressions for basket options are generally difficult to obtain, 
depending on the type of payoff function as well as the number of dimensions n. Numerical 
integration methods can be used in general. Monte Carlo simulation methods are very useful 
for this purpose. The reader interested in gaining insight into the numerical implementation 
of standard Monte Carlo methods for pricing such options is referred to Project 8 on Monte 
Carlo pricing of basket options in Part II of this book. 
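As a small illustration of the Monte Carlo approach mentioned above (and not a substitute for the project referred to), the sketch below estimates the expectation (1.187) for two stocks by sampling independent standard normals and mapping them through equation (1.186), with the two-stock Cholesky factor of equations (1.182) and (1.183) and drifts set to $r$. The function name, the seed, the number of paths, and the sample basket-call payoff are assumptions made for the example.

```python
import math
import random

def mc_two_asset_price(payoff, S1, S2, r, sigma1, sigma2, rho, T, n_paths, seed=0):
    """Plain Monte Carlo estimate of exp(-rT) * E[payoff(S_T^1, S_T^2)] under risk neutrality."""
    rng = random.Random(seed)
    sqrt_T = math.sqrt(T)
    total = 0.0
    for _ in range(n_paths):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rng.gauss(0.0, 1.0)
        # Terminal prices from (1.186) with Lambda = [[1, 0], [rho, sqrt(1 - rho^2)]].
        s1 = S1 * math.exp((r - 0.5 * sigma1**2) * T + sigma1 * sqrt_T * x1)
        s2 = S2 * math.exp((r - 0.5 * sigma2**2) * T
                           + sigma2 * sqrt_T * (rho * x1 + math.sqrt(1 - rho**2) * x2))
        total += payoff(s1, s2)
    return math.exp(-r * T) * total / n_paths

# Illustrative basket call on the equally weighted average of the two stocks.
basket_call = lambda s1, s2: max(0.5 * (s1 + s2) - 100.0, 0.0)
print(mc_two_asset_price(basket_call, 100.0, 100.0, 0.05, 0.2, 0.3, 0.5, 1.0, 20000))
```

The statistical error of such an estimate decreases like the inverse square root of the number of paths, so the printed value is only approximate.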

Exact analytical pricing formulas for certain types of elementary basket options, however, 
can be obtained, as demonstrated in the following worked-out example. 


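Example. Chooser basket options on two stocks.

Consider a basket of two stocks with prices $S_t^1$ (for stock 1) and $S_t^2$ (for stock 2) modeled
as before with constants $\mu_1$, $\mu_2$, $\rho$, $\sigma_1$, $\sigma_2$. Specifically, the risk-neutral geometric Brownian
motions of the two stocks are given by

$$S_T^1 = S_T^1(x_1) = S_0^1\,e^{(r - \sigma_1^2/2)T + \sigma_1\sqrt{T}\,x_1}, \tag{1.188}$$

$$S_T^2 = S_T^2(x_1, x_2) = S_0^2\,e^{(r - \sigma_2^2/2)T + \sigma_2\sqrt{T}\,(\rho x_1 + \sqrt{1 - \rho^2}\,x_2)}, \tag{1.189}$$

where $S_0^1$, $S_0^2$ are initially known stock prices at current time $t = 0$. The earlier pricing
formula gives

$$V_0 = \frac{e^{-rT}}{2\pi} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-\frac{1}{2}(x_1^2 + x_2^2)}\,\Pi\big(S_T^1(x_1), S_T^2(x_1, x_2)\big)\,dx_1\,dx_2 \tag{1.190}$$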
¹⁰ This drift restriction is further clarified later in the chapter where we discuss the asset-pricing theorem in
continuous time.
for the general payoff function. A simple chooser option is a European contract defined by
the payoff $\max(S_T^1, S_T^2)$. This pay-off has a simple relation to other elementary pay-offs; i.e.,
$\max(S_T^1, S_T^2) = (S_T^2 - S_T^1)_+ + S_T^1 = (S_T^1 - S_T^2)_+ + S_T^2$. The current price $V_0$ of the simple chooser
is hence given by $V_0 = C_0 + S_0^1$, where $C_0$ denotes the price of the contract with payoff
$(S_T^2 - S_T^1)_+$. This follows since an expectation of a sum is the sum of expectations and from
the fact that the stock prices drift at rate $r$; i.e., $e^{-rT}E_0^{Q(B)}[S_T^1] = S_0^1$. The problem remains to
find the price $C_0$ given by the integral

$$C_0 = \frac{e^{-rT}}{2\pi} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-\frac{1}{2}(x_1^2 + x_2^2)}\,\big(S_T^2(x_1, x_2) - S_T^1(x_1)\big)_+\,dx_1\,dx_2. \tag{1.191}$$


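The integrand is nonzero on the domain $\{(x_1, x_2) \in \mathbb{R}^2;\ S_T^2(x_1, x_2) > S_T^1(x_1)\}$. From equations
(1.188) and (1.189) we find the domain is $\{(x_1, x_2) \in \mathbb{R}^2;\ x_1 < ax_2 + b\}$, where

$$a = \frac{\sigma_2\sqrt{1 - \rho^2}}{\sigma_1 - \rho\sigma_2}, \qquad b = \frac{\log(S_0^2/S_0^1) + \frac{1}{2}(\sigma_1^2 - \sigma_2^2)T}{(\sigma_1 - \rho\sigma_2)\sqrt{T}}.$$

Here we assume $\sigma_1 - \rho\sigma_2 > 0$ and leave it to the reader to verify that a similar derivation
of the same price given next also follows for the case $\sigma_1 - \rho\sigma_2 < 0$. Using this integration
domain and inserting expressions (1.188) and (1.189) into the last integral gives

$$\begin{aligned}
C_0 &= \frac{S_0^2\,e^{-\frac{1}{2}\sigma_2^2 T}}{2\pi} \int_{-\infty}^{\infty} e^{-\frac{1}{2}x_2^2 + \sqrt{1 - \rho^2}\,\sigma_2\sqrt{T}\,x_2} \int_{-\infty}^{ax_2 + b} e^{-\frac{1}{2}x_1^2 + \rho\sigma_2\sqrt{T}\,x_1}\,dx_1\,dx_2 \\
&\quad - \frac{S_0^1\,e^{-\frac{1}{2}\sigma_1^2 T}}{2\pi} \int_{-\infty}^{\infty} e^{-\frac{1}{2}x_2^2} \int_{-\infty}^{ax_2 + b} e^{-\frac{1}{2}x_1^2 + \sigma_1\sqrt{T}\,x_1}\,dx_1\,dx_2.
\end{aligned}$$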


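By completing the square in the exponents, the integrals in $x_1$ give cumulative normal
functions $N(\cdot)$. In particular,

$$\begin{aligned}
C_0 &= \frac{S_0^2\,e^{-\frac{1}{2}(1 - \rho^2)\sigma_2^2 T}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}x_2^2 + \sqrt{1 - \rho^2}\,\sigma_2\sqrt{T}\,x_2}\,N(ax_2 + b - \rho\sigma_2\sqrt{T})\,dx_2 \\
&\quad - \frac{S_0^1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}x_2^2}\,N(ax_2 + b - \sigma_1\sqrt{T})\,dx_2.
\end{aligned}$$

At this point we make use of the integral identity (see Problem 6),

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}x^2 + Cx}\,N(Ax + B)\,dx = e^{\frac{1}{2}C^2}\,N\!\left(\frac{AC + B}{\sqrt{1 + A^2}}\right), \tag{1.192}$$

for any constants $A$, $B$, and $C$, giving

$$C_0 = S_0^2\,N\!\left(\frac{a\sqrt{1 - \rho^2}\,\sigma_2\sqrt{T} + b - \rho\sigma_2\sqrt{T}}{\sqrt{1 + a^2}}\right) - S_0^1\,N\!\left(\frac{b - \sigma_1\sqrt{T}}{\sqrt{1 + a^2}}\right).$$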


After a bit of algebra, using $a$ and $b$ just defined, we finally obtain the exact expression for
the price in terms of the initial stock prices and the effective volatility $\nu$ as

$$C_0 = S_0^2\,N(d_+) - S_0^1\,N(d_-), \tag{1.193}$$

with

$$d_\pm = \frac{\log(S_0^2/S_0^1) \pm \frac{1}{2}\nu^2 T}{\nu\sqrt{T}}, \tag{1.194}$$

$$\nu^2 = \sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2.$$
Change-of-numeraire methods for obtaining exact analytical solutions in the form of
Black-Scholes-type formulas for basket options on two stocks, as well as other options
involving two correlated underlying random variables, are discussed later in this chapter.
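The closed-form result of the chooser example, equations (1.193) and (1.194), takes only a few lines to evaluate. The sketch below implements the price $C_0$ of the payoff $(S_T^2 - S_T^1)_+$ and the chooser price $V_0 = C_0 + S_0^1$; the function names and input values are illustrative assumptions, and the formula requires $\nu > 0$.

```python
import math

def norm_cdf(x):
    """Standard cumulative normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def exchange_option(S1, S2, sigma1, sigma2, rho, T):
    """C_0 for the payoff (S_T^2 - S_T^1)_+, equations (1.193)-(1.194)."""
    nu = math.sqrt(sigma1**2 + sigma2**2 - 2.0 * rho * sigma1 * sigma2)
    d_plus = (math.log(S2 / S1) + 0.5 * nu**2 * T) / (nu * math.sqrt(T))
    d_minus = d_plus - nu * math.sqrt(T)
    return S2 * norm_cdf(d_plus) - S1 * norm_cdf(d_minus)

def chooser(S1, S2, sigma1, sigma2, rho, T):
    """V_0 = C_0 + S_0^1 for the payoff max(S_T^1, S_T^2)."""
    return exchange_option(S1, S2, sigma1, sigma2, rho, T) + S1

print(exchange_option(100.0, 110.0, 0.2, 0.3, 0.5, 1.0))
print(chooser(100.0, 110.0, 0.2, 0.3, 0.5, 1.0))
```

Note that the risk-free rate does not appear in the result, and that swapping the roles of the two stocks leaves the chooser price unchanged, as the symmetric payoff $\max(S_T^1, S_T^2)$ requires.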


Problems 


Problem 1. Use Itô’s lemma to verify that equation (1.154) provides a solution to equation (1.145).

Problem 2. Consider an exponential function of a normal random variable $X$, $e^{aX}$ for any
parameter $a$, where $X \in (-\infty, \infty)$ has probability density at $X = x$ given by

$$p(x, t) = \frac{1}{\sqrt{2\pi t}}\,e^{-x^2/2t} \qquad (t > 0).$$

Show that

$$E[e^{aX}] = \exp(a^2 t/2).$$

Hint: make use of the integral identity

$$\int_{-\infty}^{\infty} e^{-ax^2 + bx}\,dx = \sqrt{\frac{\pi}{a}}\,e^{b^2/4a},$$

where $a > 0$ and $b$ are constants.


Problem 3. Derive the expectation in equation (1.171) by making use of the identity
$(a - b)_+ = (b - a)_+ + a - b$.


Problem 4. Consider the general correlated $n$-dimensional geometric Brownian process discussed
in this section. Use Itô’s lemma to show that the processes $Y_t^i \equiv \log S_t^i$ obey

$$dY_t^i = (\mu_i - \sigma_i^2/2)\,dt + \sigma_i \sum_{j=1}^{n} \Lambda_{ij}\,dW_t^j. \tag{1.195}$$

Assuming all volatilities are nonzero, the correlation matrix is positive definite. Hence, $\Lambda$ has
an inverse $\Lambda^{-1}$. Define new random variables $X_t^i = \sum_{j=1}^{n} \sigma_j^{-1}\Lambda_{ij}^{-1}\,Y_t^j$ and show that

$$dX_t^i = \tilde{\mu}_i\,dt + dW_t^i, \tag{1.196}$$

with $\tilde{\mu}_i = \sum_{j=1}^{n} \sigma_j^{-1}\Lambda_{ij}^{-1}\,(\mu_j - \tfrac{1}{2}\sigma_j^2)$, has solution

$$X_T^i = X_t^i + \tilde{\mu}_i\,(T - t) + (W_T^i - W_t^i), \qquad i = 1, \ldots, n. \tag{1.197}$$

Invert this solution back into the old random variables, hence obtaining equation (1.186).


Problem 5. Treat $W_{T-t}^i$ and $\log(S_T^i/S_t^i)$ as two sets of $n$ independent variables in equation (1.186)
and thereby compute the Jacobian of the transformation among the variables.
Then invert equation (1.186) and use the identity in equation (1.48) with the distribution
function for the $n$ independent uncorrelated Wiener processes to show that the analytical
formula for the transition probability density for geometric Brownian motion is given by

$$p(\mathbf{S}_T, \mathbf{S}_t; T - t) = (2\pi(T - t))^{-n/2}\,|C|^{-1/2}\,\exp\left(-\tfrac{1}{2}\,\mathbf{z} \cdot C^{-1} \cdot \mathbf{z}\right), \tag{1.198}$$

where the $n$-dimensional vector $\mathbf{z}$ has components

$$z_i = \frac{\log(S_T^i/S_t^i) - (\mu_i - \frac{1}{2}\sigma_i^2)(T - t)}{\sqrt{T - t}}. \tag{1.199}$$





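Problem 6. Using the definition of the cumulative normal function, write

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{1}{2}x^2 + Cx}\,N(Ax + B)\,dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-\frac{1}{2}x^2 + Cx} \int_{-\infty}^{Ax + B} e^{-\frac{1}{2}y^2}\,dy\,dx. \tag{1.200}$$

Introduce a change of variables $(\eta, \xi) = (y - Ax, y + Ax)$ and integrate while completing
squares to obtain equation (1.192).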


1.7 Forwards and European Calls and Puts 


Consider a situation with a stock price that at current time $t = 0$ has price $S_0$ while at time
$T > 0$ in the future is described by a certain random variable $S_T$. Suppose that there is also a
zero-coupon bond maturing at time $T$, i.e., a riskless claim to one unit of account at time $T$. Let

$$Z_t(T) = e^{-r(T-t)} \tag{1.201}$$

be its price at time $t$. Here $r$ is the yield up to time $T$. Unlike the rate introduced in
equation (1.5), in this case $r$ is defined with the continuously compounded rule; we refer
again to Chapter 2 for a more systematic discussion of fixed-income terminology.

Let’s consider a situation where $S_T$ is contained in the half-line of positive real numbers
$\mathbb{R}_+$. Let $P$ be the real-world measure with density $p(S)$; $P$ is inferred through statistical
estimations based on historical data. Pricing measures, instead, are evaluated as the result of
a calibration procedure starting from option prices. Also, as discussed in detail later in this
chapter, pricing measures depend on the choice of a numeraire asset. In our framework, a
numeraire asset is given by an asset price process, $g_t$, that is strictly positive at initial time
$t = 0$ and any other future time $t$, $t \le T$. The corresponding pricing measure is denoted by
$Q(g)$, specifying the fact that the asset price $g_t$ is the chosen numeraire. A possible choice of
numeraire is given by the bond $g_t = Z_t(T)$; this choice corresponds to the pricing measure
denoted by $Q(Z(T))$, which is called the forward measure. Note that since $r$ is constant, this
also coincides with the risk-neutral measure. Technically speaking the name for the risk-neutral
measure corresponds to using the continuously compounded money-market account
$B_t = e^{rt}$ (i.e., the continuously compounded value of one unit of account deposited at time
$t = 0$ earning interest rate $r$) as numeraire.¹¹ For constant interest rate, the two measures are
then easily shown to be equivalent since $Z_t(T) = B_t/B_T$. This point is further clarified in
Chapter 2. Other choices of numeraire asset are also possible; for example, $g_t = S_t$ corresponds
to using the stock price as numeraire. As mentioned earlier and also described in detail later
in the chapter, expectations taken based on the information available up to current time $t$ with
respect to the pricing measure $Q(g)$, with $g_t$ as numeraire asset price, are denoted by $E_t^{Q(g)}[\,\cdot\,]$.
In this section, note that (without loss in generality) we are simply setting $t = 0$ as current
time and allowing $T$ to be any future time.


11 Note that we previously used the symbol $B_t$ to denote the bond price. However, here we instead use $B_t$ to
denote the value of the money-market account.

By applying risk-neutral valuation to the zero-coupon bond, we find that

$$Z_0(T) = e^{-rT} = \alpha E_0^{Q(Z(T))}[Z_T(T)] = \alpha E_0^{Q(Z(T))}[1] = \alpha, \tag{1.202}$$

where $Z_T(T) = 1$. Hence, the discount factor $\alpha$ can be interpreted as the initial price of the
zero-coupon bond. Although we have not yet formally introduced continuous-time financial
models at this point in the chapter, the arguments presented in this section are generally valid
if we assume dynamic trading is allowed in continuous time.

Risky assets are modeled by a function $\phi: \mathbb{R}_+ \to \mathbb{R}$ of the stock price at time $T$. Let
$(A_t)_{0 \le t \le T}$ be a price process such that $A_T = \phi(S_T)$; such an asset is called a European-style
option on the stock $S$ with maturity $T$ and payoff function $\phi(S_T)$. Applying the asset-pricing
theorem, the arbitrage-free price $A_0$ at time $t = 0$ of this option can be written as a discounted
expectation under the pricing measure $Q(Z(T))$,

$$A_0 = e^{-rT} E_0^{Q(Z(T))}[\phi(S_T)]. \tag{1.203}$$


An alternative and instructive way of writing this equation is 


$$\frac{A_0}{Z_0(T)} = E_0^{Q(Z(T))}\left[\frac{A_T}{Z_T(T)}\right]. \tag{1.204}$$

Although the numeraire asset in equation (1.204) is the riskless bond $Z_t(T)$, the pricing
formula can be extended to the case of a generic numeraire asset $g$. Let's denote by $Q(g)$ the
probability measure with $g_t$ as numeraire asset price at time $t$, defined so that

$$\frac{A_0}{g_0} = E_0^{Q(g)}\left[\frac{A_T}{g_T}\right] \tag{1.205}$$

for all random variables $A_T = \phi(S_T)$ and for all $T > 0$. Assuming the price is unique, equating
the price $A_0$ in equation (1.204) with that in this last equation gives a relationship between
the two pricing (or probability) measures:

$$E_0^{Q(g)}\left[\frac{A_T}{g_T}\right] = \frac{Z_0(T)}{g_0}\, E_0^{Q(Z(T))}\left[\frac{A_T}{Z_T(T)}\right]. \tag{1.206}$$


A variety of numeraire assets can be chosen for derivative pricing. Depending on the 
pay-off, one choice over another may be more convenient for evaluating the expectation and 
hence obtaining the derivative price, as seen in detail in the examples of pricing derivations 
in Section 1.12. 

A forward contract on an underlying stock $S$ stipulated at initial time $t = 0$ and with
maturity time $t = T$ is a European-style claim with payoff $S_T - F_0$ at time $T$. Here $F_0$ is the
forward price at time $t = 0$. Forward contracts are entered at the equilibrium forward price $F_0$,
for which their present value is zero. A simple arbitrage argument gives the (model-independent)
forward price $F_0$ as

$$F_0 = Z_0(T)^{-1} S_0. \tag{1.207}$$


Indeed, to replicate the pay-off of a forward contract one can buy the underlying stock at
price $S_0$ and carry it to maturity while funding the purchase with a loan to be repaid also
at maturity. The nominal amount of the loan to be paid back at time $T$ is then $Z_0(T)^{-1} S_0$ (e.g., this
equals $e^{rT} S_0$ if we assume a constant interest rate).

Since the forward contract is initially worthless, the valuation formula yields

$$0 = E_0^{Q(Z(T))}[S_T - F_0]. \tag{1.208}$$

Since $F_0$ is constant, we have that

$$E_0^{Q(Z(T))}[S_T] = F_0 = Z_0(T)^{-1} S_0 = e^{rT} S_0. \tag{1.209}$$


The interpretation of this formula is that, under the pricing measure $Q(Z(T))$, the expected
return on a stock is the risk-free yield $r$ over the maturity $T$. The argument just outlined is
model independent and can be shown to extend to all assets with no intermediate cash flows,
and thus no carry costs, before maturity time $T$. The expected return on any such asset under the
pricing measure $Q(Z(T))$ is the risk-free rate, no matter how volatile the asset is. Also notice that the
expected return with respect to the real-world measure is generally quite different.

The popular geometric Brownian motion model, also called the Black-Scholes model,
gives a lognormal risk-neutral probability density for the stock price process. As derived in
Section 1.6, the stock price at time $T$ is a lognormal random variable,

$$S_T = S_0 \exp\left(\left(r - \frac{\sigma^2}{2}\right)T + \sigma\sqrt{T}\,x\right), \tag{1.210}$$

where $x \sim N(0, 1)$ and $\sigma > 0$ is the model volatility parameter. As we have seen, the
risk-neutral distribution for $S_T$ is defined in such a way as to satisfy the growth condition in
equation (1.209):

$$E_0^{Q(Z(T))}[S_T] = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} S_0 \exp\left(\left(r - \frac{\sigma^2}{2}\right)T + \sigma\sqrt{T}\,x\right) e^{-x^2/2}\,dx = S_0 e^{rT}. \tag{1.211}$$
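
The growth condition (1.211) is easy to confirm numerically. The following is a minimal sketch, not part of the original derivation: it integrates the lognormal price (1.210) against the standard normal density with a plain trapezoidal rule. The function name `expected_terminal_price`, the integration window, and the parameter values are assumptions chosen only for illustration.

```python
import math

def expected_terminal_price(s0, r, sigma, T, n=20001, width=10.0):
    """Trapezoidal estimate of E[S_T] for S_T given by (1.210), with x ~ N(0, 1)."""
    h = 2.0 * width / (n - 1)
    total = 0.0
    for i in range(n):
        x = -width + i * h
        s_T = s0 * math.exp((r - 0.5 * sigma ** 2) * T + sigma * math.sqrt(T) * x)
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += weight * s_T * math.exp(-0.5 * x * x)
    return total * h / math.sqrt(2.0 * math.pi)

s0, r, sigma, T = 100.0, 0.05, 0.2, 1.0   # illustrative values (assumed)
estimate = expected_terminal_price(s0, r, sigma, T)
forward = s0 * math.exp(r * T)            # right-hand side of (1.211)
print(f"quadrature: {estimate:.8f}   S0*exp(rT): {forward:.8f}")
```

The two printed numbers should agree to many digits, independently of the value chosen for the volatility.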


Two important examples of European-style securities are the call option struck at $K$ and of
maturity $T$ with price process $C_t$ and payoff function

$$C_T = (S_T - K)_+ \tag{1.212}$$

and the put option struck at $K$ and of maturity $T$ with price process $P_t$ and payoff function

$$P_T = (K - S_T)_+. \tag{1.213}$$


Theorem 1.3. (Put-Call parity). If $C_0(S_0, K, T)$ and $P_0(S_0, K, T)$ denote the prices at time
$t = 0$ of a plain European call and a plain European put, respectively, both maturing at a
later time $T$ and both struck at $K$, then we have the put-call parity relationship, namely,

$$C_0(S_0, K, T) - P_0(S_0, K, T) = S_0 - K Z_0(T). \tag{1.214}$$


The proof of the put-call parity relationship follows from the fact that a portfolio with a
long position in a call struck at $K$ and maturing at $T$ and a short position in a put struck at $K$
and maturing at $T$ has pay-off $(S_T - K)_+ - (K - S_T)_+ = S_T - K$, the same as a forward contract
stipulated at the forward price $K$, whose present value is $S_0 - K Z_0(T)$. (See Section 1.8.)

In contrast to the put-call parity relationship in equation (1.214), the evaluation of the
price of a call or put option requires making an assumption on the measure $Q(Z(T))$ and the
stock price process. Under the Black-Scholes model, where the stock price at time $T$ is given by
equation (1.210), the expectation $E_0^{Q(Z(T))}[(S_T - K)_+]$ can be reduced to a simple integral.
As shown in a detailed derivation in Section 1.6,

$$E_0^{Q(Z(T))}[(S_T - K)_+] = S_0 e^{rT} N(d_+) - K N(d_-), \tag{1.215}$$

where $N(\cdot)$ is the standard cumulative normal distribution function,

$$d_\pm = \frac{\log(S_0/K) + (r \pm \sigma^2/2)T}{\sigma\sqrt{T}}, \tag{1.216}$$

and the pricing formula for a plain European call option (with constant interest rate) in the
Black-Scholes model is

$$\begin{aligned} C_{BS}(S_0, K, T, \sigma, r) &= e^{-rT} E_0^{Q(Z(T))}[(S_T - K)_+] \\ &= S_0 N(d_+) - K e^{-rT} N(d_-). \end{aligned} \tag{1.217}$$


European put options are priced analytically in similar fashion by computing the expectation
$E_0^{Q(Z(T))}[(K - S_T)_+]$, as seen in the derivation of equation (1.172) of Section 1.6. From
this formula, or by applying the put-call parity relation (1.214) using equation (1.217), we
have the equivalent formulas for the put option price:

$$\begin{aligned} P_{BS}(S_0, K, T, \sigma, r) &= e^{-rT} E_0^{Q(Z(T))}[(K - S_T)_+] \\ &= S_0 N(d_+) - K e^{-rT} N(d_-) - S_0 + K e^{-rT} \\ &= K e^{-rT} N(-d_-) - S_0 N(-d_+). \end{aligned} \tag{1.218}$$
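
The pricing formulas (1.217) and (1.218) translate directly into code. The sketch below is only an illustration using the Python standard library: the function names (`norm_cdf`, `d_plus_minus`, `bs_call`, `bs_put`) and the sample inputs are assumptions, and the cumulative normal is computed from `math.erf`. The last lines check the put-call parity relation (1.214) with $Z_0(T) = e^{-rT}$.

```python
import math

def norm_cdf(x):
    """Standard cumulative normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d_plus_minus(s0, k, T, sigma, r):
    """The pair (d_+, d_-) of equation (1.216)."""
    vol = sigma * math.sqrt(T)
    d_plus = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * T) / vol
    return d_plus, d_plus - vol

def bs_call(s0, k, T, sigma, r):
    """Black-Scholes call price, equation (1.217)."""
    d_plus, d_minus = d_plus_minus(s0, k, T, sigma, r)
    return s0 * norm_cdf(d_plus) - k * math.exp(-r * T) * norm_cdf(d_minus)

def bs_put(s0, k, T, sigma, r):
    """Black-Scholes put price, last line of equation (1.218)."""
    d_plus, d_minus = d_plus_minus(s0, k, T, sigma, r)
    return k * math.exp(-r * T) * norm_cdf(-d_minus) - s0 * norm_cdf(-d_plus)

s0, k, T, sigma, r = 100.0, 105.0, 0.5, 0.25, 0.03   # illustrative values (assumed)
call = bs_call(s0, k, T, sigma, r)
put = bs_put(s0, k, T, sigma, r)
# Put-call parity (1.214): C_0 - P_0 = S_0 - K * Z_0(T), with Z_0(T) = exp(-r T).
parity_gap = (call - put) - (s0 - k * math.exp(-r * T))
print(f"call = {call:.6f}, put = {put:.6f}, parity gap = {parity_gap:.2e}")
```

The parity gap should vanish up to floating-point rounding for any admissible inputs, since parity does not depend on the volatility.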


A direct calculation shows that the functions $C_{BS}$ and $P_{BS}$ satisfy the Black-Scholes partial
differential equation (BSPDE). Analytical and numerical methods for solving this equation
are discussed at length throughout later sections and chapters of this book. The numerical
projects in Part II provide implementation details for finite-difference lattice approaches to
option pricing. A derivation of the BSPDE based on a dynamic replication strategy is provided
in Section 1.9 (and a general derivation is given in Section 1.13), but here we simply quote
it for the purposes of the present discussion. In terms of the partial derivatives with respect
to the time to maturity $T$ and current stock price $S_0$ (with $r$ and $\sigma$ constants) this equation
can be rewritten in the form

$$\frac{\partial V}{\partial T} = \frac{\sigma^2 S_0^2}{2}\frac{\partial^2 V}{\partial S_0^2} + r S_0 \frac{\partial V}{\partial S_0} - rV, \tag{1.219}$$


where the option value $V = V(S_0, T)$. The original Black-Scholes equation is really a
backward-time equation involving $\partial V/\partial t$ in calendar time $t$, where the price $V$ is expressed in
terms of $t$ and equals the pay-off at maturity (or expiry) $t = T$. That is, if we were to express
the option value explicitly as a function of calendar time $t$, then, for example,
for the case of a call struck at $K$, $C(S, t = T) = (S - K)_+$. Note that in the present context,
however, since we are expressing the option value with respect to the time to maturity, denoted
here by the variable $T$, the option price equals the pay-off when $T = 0$ (i.e., at zero time to
expiry): $C_{BS}(S, K, T = 0) = (S - K)_+$ and $P_{BS}(S, K, T = 0) = (K - S)_+$, as is easily verified
via equations (1.217) and (1.218) in the limit $T \to 0$. Since the Black-Scholes equation is
time homogeneous for time-independent interest rate and volatility, option prices are generally
functions of $T - t$ (where $t$ and $T > t$ represent actual calendar times), so $\partial/\partial t = -\partial/\partial T$
in the original Black-Scholes equation. By replacing $T - t \to T$ (without loss of generality
this corresponds to setting current time $t = 0$), we further simplify all expressions, wherein
$T$ now represents the time to maturity. The form in equation (1.219) is convenient for the
following discussion.

Whether the pricing measure $Q(Z(T))$ is unique or not depends on the choice of hedging
instruments. The asset-pricing theorem (in the single-period setting as stated earlier and
in the continuous-time case discussed later in this chapter) only implies that, assuming
absence of arbitrage, there exists such a measure and that this measure prices all pay-offs.
Indeterminacies in $Q(Z(T))$ arise when there is no perfect replication strategy for a given
pay-off, in which case the pay-off can be priced independently of the hedging instruments.
The Black-Scholes model provides the most basic pricing model that captures option prices
through the single volatility parameter $\sigma$. Since in finance there is no fundamental theory
governing asset price processes, all models are inaccurate to some degree. The Black-Scholes
model is perhaps the most inaccurate among all those used, but also the most basic because of
its simplicity. Inaccuracies in the Black-Scholes model are captured by the implied volatility
surface, defined as the function $\sigma_{BS}(K, T)$ such that

$$C_{BS}(S_0, K, T, \sigma_{BS}(K, T), r) = C_0(K, T), \tag{1.220}$$

where $C_0(K, T)$ is the observed market price of the call option struck at $K$ and maturing at
time $T$. This describes a surface $\sigma_I = \sigma_{BS}(K, T)$ in which the implied volatility $\sigma_I$ is graphed
as a function of the two variables $K$, $T$, i.e., across a range of strikes $K$ and time to maturity
values $T$. For any fixed pair of values $(K, T)$ (and assumed fixed $S_0$, $r$), the function $C_{BS}$
is monotonically increasing in $\sigma$ [see equation (1.222)]; hence the preceding equation can
be uniquely inverted to give a value for the so-called Black-Scholes implied volatility $\sigma_I$
for any observed market price of a call. If the Black-Scholes (i.e., lognormal) model were
accurate, the implied volatility surface would be flat, for one single volatility
parameter would price all options. Empirical evidence shows that implied volatility surfaces
are instead curved, not flat.
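
Because $C_{BS}$ is monotonically increasing in $\sigma$, equation (1.220) can be inverted with a simple bracketing search. The sketch below uses bisection, which is one possible choice and not necessarily the method used in the numerical projects of Part II; the function names, the search bracket, the tolerance, and the synthetic quote are all assumptions made for illustration.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s0, k, T, sigma, r):
    """Black-Scholes call price, equation (1.217)."""
    vol = sigma * math.sqrt(T)
    d_plus = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * T) / vol
    return s0 * norm_cdf(d_plus) - k * math.exp(-r * T) * norm_cdf(d_plus - vol)

def implied_vol(price, s0, k, T, r, lo=1e-6, hi=5.0, tol=1e-10):
    """Solve C_BS(sigma) = price by bisection on the bracket [lo, hi] (assumed)."""
    if not bs_call(s0, k, T, lo, r) <= price <= bs_call(s0, k, T, hi, r):
        raise ValueError("price is not attainable for sigma in [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(s0, k, T, mid, r) < price:
            lo = mid      # price too low: the root lies above mid
        else:
            hi = mid      # price high enough: the root lies at or below mid
    return 0.5 * (lo + hi)

# Synthetic "market" quote generated with a known volatility of 32%.
market_price = bs_call(100.0, 110.0, 0.75, 0.32, 0.02)
sigma_implied = implied_vol(market_price, 100.0, 110.0, 0.75, 0.02)
print(f"implied volatility = {sigma_implied:.8f}")
```

Repeating the inversion over a grid of quoted strikes and maturities yields a discrete sample of the implied volatility surface.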

A practical and widely used approach to risk management involving the Black-Scholes
pricing formulas is based on the calculation of portfolio sensitivities. Sensitivities of option
prices in the Black-Scholes model with respect to changes in the underlying parameters
$r$, $T$, $S_0$, $\sigma$ are of importance to hedging and computing risk for nonlinear portfolios. Within
the Black-Scholes formulation, these sensitivities are easily obtained analytically by taking
the respective partial derivatives of the European-style option price $V$ for a given pay-off.
The sensitivities (also known as the Greeks) are defined as follows, where we
specialize to provide the exact expressions for the case of a plain-vanilla call under the
Black-Scholes model:


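• The delta, denoted by $\Delta$, is defined as the derivative

$$\Delta = \frac{\partial V}{\partial S_0} = \frac{\partial C_{BS}}{\partial S_0} = N(d_+). \tag{1.221}$$

• The vega, denoted by $\Lambda$, is defined as the derivative

$$\Lambda = \frac{\partial V}{\partial \sigma} = \frac{\partial C_{BS}}{\partial \sigma} = S_0 \sqrt{\frac{T}{2\pi}}\, e^{-d_+^2/2}. \tag{1.222}$$

• The gamma, denoted by $\Gamma$, is defined as the second derivative

$$\Gamma = \frac{\partial^2 V}{\partial S_0^2} = \frac{\partial^2 C_{BS}}{\partial S_0^2} = \frac{e^{-d_+^2/2}}{S_0 \sigma \sqrt{2\pi T}}. \tag{1.223}$$

• The rho, denoted by $\rho$, is defined as the derivative

$$\rho = \frac{\partial V}{\partial r} = \frac{\partial C_{BS}}{\partial r} = K T e^{-rT} N(d_-). \tag{1.224}$$

• The theta, denoted by $\Theta$, is defined as the derivative¹²

$$\Theta = \frac{\partial V}{\partial T} = \frac{\partial C_{BS}}{\partial T} = \frac{\sigma^2 S_0^2}{2}\,\Gamma + r(S_0 \Delta - C_{BS}). \tag{1.225}$$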


The numerical project called “The Black-Scholes Model” in Part II provides the interested
reader with an in-depth implementation of such formulas for calls as well as for puts
and so-called butterfly spread options. The corresponding spreadsheet is then useful for
numerically graphing and analyzing the dependence of the various option prices and their
sensitivities as functions of either $r$, $\sigma$, $S_0$, $K$, or $T$.
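
As a quick cross-check of the closed-form sensitivities (1.221)-(1.225) for a call, each can be compared against a finite-difference bump of the price formula (1.217). The sketch below is an independent illustration and is not the Part II project; the function names, bump sizes, and inputs are assumptions, and theta is taken with respect to the time to maturity $T$, as in (1.225).

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def bs_call(s0, k, T, sigma, r):
    """Black-Scholes call price, equation (1.217)."""
    vol = sigma * math.sqrt(T)
    d_plus = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * T) / vol
    return s0 * norm_cdf(d_plus) - k * math.exp(-r * T) * norm_cdf(d_plus - vol)

def call_greeks(s0, k, T, sigma, r):
    """Closed-form call sensitivities, equations (1.221)-(1.225)."""
    sqrt_T = math.sqrt(T)
    d_plus = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt_T)
    d_minus = d_plus - sigma * sqrt_T
    delta = norm_cdf(d_plus)
    vega = s0 * sqrt_T * norm_pdf(d_plus)
    gamma = norm_pdf(d_plus) / (s0 * sigma * sqrt_T)
    rho = k * T * math.exp(-r * T) * norm_cdf(d_minus)
    price = bs_call(s0, k, T, sigma, r)
    theta = 0.5 * sigma ** 2 * s0 ** 2 * gamma + r * (s0 * delta - price)
    return {"delta": delta, "vega": vega, "gamma": gamma, "rho": rho, "theta": theta}

def central_difference(f, x, h):
    """Symmetric first-derivative estimate of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

s0, k, T, sigma, r = 100.0, 95.0, 0.5, 0.3, 0.04   # illustrative values (assumed)
greeks = call_greeks(s0, k, T, sigma, r)
h = 0.01  # stock bump used for the second difference
bumped = {
    "delta": central_difference(lambda s: bs_call(s, k, T, sigma, r), s0, 1e-3),
    "vega": central_difference(lambda v: bs_call(s0, k, T, v, r), sigma, 1e-5),
    "gamma": (bs_call(s0 + h, k, T, sigma, r) - 2.0 * bs_call(s0, k, T, sigma, r)
              + bs_call(s0 - h, k, T, sigma, r)) / h ** 2,
    "rho": central_difference(lambda q: bs_call(s0, k, T, sigma, q), r, 1e-5),
    "theta": central_difference(lambda t: bs_call(s0, k, t, sigma, r), T, 1e-5),
}
for name in greeks:
    print(f"{name:>5}: closed form {greeks[name]:.6f}   finite difference {bumped[name]:.6f}")
```

Agreement of the theta row is also a numerical confirmation that the call price satisfies the BSPDE (1.219), since the right-hand side of (1.225) is the right-hand side of that equation.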

Given the sensitivities, one can approximate the change in price $\delta C$ of a call option due to
small changes $T \to T + \delta T$, $S_0 \to S_0 + \delta S_0$, $\sigma \to \sigma + \delta\sigma$, $r \to r + \delta r$ by means of a truncated
Taylor expansion,

$$\delta C = \Delta(\delta S_0) + \Lambda(\delta\sigma(K, T)) + \frac{1}{2}\Gamma(\delta S_0)^2 + \rho(\delta r) + \Theta(\delta T). \tag{1.226}$$

Here, $\delta S_0$, $\delta r$, $\delta\sigma(K, T)$, and $\delta T$ are small changes in the stock price, the interest rate, the
implied Black-Scholes volatility $\sigma = \sigma(K, T)$, and the time to maturity $T$ of the option at
hand. In the Black-Scholes model, $\sigma(K, T)$ does not depend on the two arguments and these
parameters are constant, so the only source of randomness is the price of the underlying.
However, in practice one observes that implied volatilities and interest rates also change over
time and affect option values.
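
To get a feel for the quality of the truncated expansion (1.226), one can compare it with a full revaluation of the call. The sketch below is an illustration under simplifying assumptions: only the stock price and the volatility are bumped ($\delta r = \delta T = 0$), and the function names, bump sizes, and inputs are chosen arbitrarily.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def bs_call(s0, k, T, sigma, r):
    """Black-Scholes call price, equation (1.217)."""
    vol = sigma * math.sqrt(T)
    d_plus = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * T) / vol
    return s0 * norm_cdf(d_plus) - k * math.exp(-r * T) * norm_cdf(d_plus - vol)

def taylor_change(s0, k, T, sigma, r, d_s, d_sigma):
    """Equation (1.226) restricted to stock and volatility moves (dr = dT = 0)."""
    sqrt_T = math.sqrt(T)
    d_plus = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt_T)
    delta = norm_cdf(d_plus)                           # (1.221)
    vega = s0 * sqrt_T * norm_pdf(d_plus)              # (1.222)
    gamma = norm_pdf(d_plus) / (s0 * sigma * sqrt_T)   # (1.223)
    return delta * d_s + vega * d_sigma + 0.5 * gamma * d_s ** 2

s0, k, T, sigma, r = 100.0, 100.0, 1.0, 0.2, 0.05   # illustrative values (assumed)
d_s, d_sigma = 0.5, 0.005                           # small bumps (assumed)
approx = taylor_change(s0, k, T, sigma, r, d_s, d_sigma)
exact = bs_call(s0 + d_s, k, T, sigma + d_sigma, r) - bs_call(s0, k, T, sigma, r)
print(f"Taylor estimate = {approx:.6f}, full revaluation = {exact:.6f}")
```

The residual between the two numbers comes from the neglected higher-order terms, such as the cross derivative in $S_0$ and $\sigma$, and grows as the bumps are enlarged.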

As we discuss in more detail in Chapter 4, the risk of option positions is hedged on a 
portfolio basis and risk-reducing trades are placed in such a way as to decrease portfolio 
sensitivities to the underlyings. In particular: 


• The delta can be reduced by taking a position in the stock or, more commonly, in a
forward or futures contract on the stock.
• The vega and gamma can be reduced by taking a position in another option.
• The rho can be reduced by taking a position in a zero-coupon bond of maturity $T$.

Problems

Problem 1. Derive the formulas in equations (1.221)-(1.225).


Problem 2. Obtain formulas analogous to equations (1.221)-(1.225) for the corresponding
put option with value $P_{BS}$.


12 In other literature this is sometimes defined as $-\partial V/\partial T$.
Problem 3. Consider a portfolio with positions $\theta_i$ in $N$ securities, each with price $f_i$,
$i = 1, \ldots, N$, respectively. Assume the security prices are functions of the same spot $S_0$ at
current time $t_0$ and that each price function $f_i = f_i(S_0, T_i - t_0)$ satisfies the time-homogeneous
BSPDE with constant interest rate and volatility. The contract maturity dates $T_i$ are allowed
to be distinct. Find the relation between the $\Theta$, $\Delta$, and $\Gamma$ of the portfolio.


1.8 Static Hedging and Replication of Exotic Pay-Offs 


Options other than the calls and puts considered in the previous section are often called exotic. 
In this section, we consider the replication of arbitrary pay-offs via portfolios made up of 
standard instruments (i.e., consisting of calls, puts, underlying stock, and cash). In finance, 
such replicating portfolios are useful for the static hedging of European-style options. 

A butterfly spread option maturing in time $T$ is a portfolio of calls at three strikes with current value

$$B_0(S_0, K, T, \epsilon) = \frac{1}{\epsilon^2}\left(C_0(S_0, K - \epsilon, T) + C_0(S_0, K + \epsilon, T) - 2C_0(S_0, K, T)\right), \tag{1.227}$$

for some $\epsilon > 0$, where $C_0(S_0, K, T)$ represents the (model-independent) price of a European
call with current stock price $S_0$, strike $K$, and time to maturity $T$. We observe that (apart from
the normalization constant) this option consists of a long position in a call struck at $K + \epsilon$, a
long position in a call struck at $K - \epsilon$, and two short positions in a call struck at $K$, with all
calls maturing at the same time. At expiry, $T \to 0$, we simply have the payoff function for the
butterfly spread:

$$\delta_\epsilon(S_T - K) = \frac{1}{\epsilon^2}\left(C_T(S_T, K - \epsilon) + C_T(S_T, K + \epsilon) - 2C_T(S_T, K)\right) = \frac{1}{\epsilon^2}\begin{cases} (S_T - (K - \epsilon))_+, & S_T \le K, \\ ((K + \epsilon) - S_T)_+, & S_T > K. \end{cases} \tag{1.228}$$

Here we have used $C_T(S_T, K) = (S_T - K)_+$ for the pay-off of a call. The normalization factor
hence ensures that the area under the graph of the pay-off (as a function of $S_T$) is unity, for all
choices of $\epsilon$ (see Figure 1.3). In the limit $\epsilon \to 0$, the function $\delta_\epsilon(S_T - K)$ converges to the
Dirac delta function $\delta(S_T - K)$ (see Problem 1).

FIGURE 1.3 Payoff functions for a call spread and a corresponding unit butterfly spread struck at $K$,
where $2\epsilon$ is the width of the butterfly spread.

From the one-dimensional version of equation (1.27), we have

$$\lim_{\epsilon \to 0}\int_0^\infty \delta_\epsilon(S_T - K) f(K)\,dK = \int_0^\infty \delta(S_T - K) f(K)\,dK = f(S_T), \tag{1.229}$$

for any $S_T > 0$ and any continuous function $f$. From the linearity property of expectations and
risk-neutral pricing we must have

$$B_0(S_0, K, T, \epsilon) = e^{-rT} E_0^{Q(Z(T))}[\delta_\epsilon(S_T - K)]. \tag{1.230}$$
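In particular, we find that in the limit $\epsilon \to 0$,

$$\begin{aligned} \lim_{\epsilon \to 0} B_0(S_0, K, T, \epsilon) &= \lim_{\epsilon \to 0} e^{-rT} E_0^{Q(Z(T))}[\delta_\epsilon(S_T - K)] \\ &= e^{-rT} \lim_{\epsilon \to 0}\int_0^\infty p(S_0, 0; S_T, T)\,\delta_\epsilon(S_T - K)\,dS_T \\ &= e^{-rT}\int_0^\infty p(S_0, 0; S_T, T)\,\delta(S_T - K)\,dS_T \\ &= e^{-rT} p(S_0, 0; K, T), \end{aligned} \tag{1.231}$$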


where $p(S_0, 0; K, T)$ is the risk-neutral probability density that the stock price $S_T$ equals $K$
at time $t = T$, conditional on its equaling $S_0$ at initial time $t = 0$. This result basically tells
us that the price of an infinitely narrow butterfly spread is the price of a so-called Arrow-Debreu
security, i.e., the value of a security that pays one unit of account if the stock price
(i.e., the state) $S_T = K$ is attained at maturity. One concludes that knowledge of the prices
of European calls at all strikes is equivalent to the knowledge of the risk-neutral transition
probability density $p(S_0, 0; S_T, T)$ for all $S_T$. Notice, though, that this does not uniquely
identify the price process under the risk-neutral measure because the transition
probabilities $p(S_t, t; K, T)$ for $t > 0$ are not uniquely determined.¹³ By recognizing that
equation (1.227) is in fact a representation of the finite difference for the second derivative,
we obtain from the last equation

$$\frac{\partial^2 C_0(S_0, K, T)}{\partial K^2} = e^{-rT} p(S_0, 0; K, T). \tag{1.232}$$

We will arrive at this equation again in Section 1.13 when we discuss the Black-Scholes
partial differential equation and its dual equation.

Other common portfolios of trades include the following.


• Covered calls consist of a long position in the underlying and a short position in a
call, typically struck above the spot at the contract inception. This position is meant to
trade potential returns above the strike at future time for the option price. The pay-off
at the option maturity is

$$S_T - (S_T - K)_+. \tag{1.233}$$

• Bull spreads are option spread positions consisting of one long call struck at $K_1$ and
one short call struck at $K_2$ with payoff function

$$(S_T - K_1)_+ - (S_T - K_2)_+, \tag{1.234}$$

$K_1 < K_2$. This portfolio is designed to profit from a rally in the price of the underlying
security.


13 There are in general a variety of models involving jumps, stochastic or state-dependent volatility, or a
combination of all that result in the same prices for European options but yield different valuations for
path-dependent pay-offs.
• Bear spreads are option spread positions in one short put struck at $K_1$ and one long
put struck at $K_2$ with payoff function

$$-(K_1 - S_T)_+ + (K_2 - S_T)_+, \tag{1.235}$$

$K_1 < K_2$. This portfolio profits from a decline in price of the underlying security.

• Digitals are obtained in the limit $(K_2 - K_1) \to 0$ of a spread option with positions scaled
by the inverse strike spread $(K_2 - K_1)^{-1}$. A digital is also called a binary. For instance, the
pay-off of a bull digital (or digital call) is a unit step function obtained when such a
limit is taken in a bull spread with $(K_2 - K_1)^{-1}$ long positions in a call struck at $K_1$
and $(K_2 - K_1)^{-1}$ short positions in a call struck at $K_2$, with $K_1 < K_2$:

$$\theta(S_T - K) = \begin{cases} 1 & \text{if } S_T \ge K, \\ 0 & \text{otherwise.} \end{cases} \tag{1.236}$$

The bear digital (or digital put) is obtained similarly by considering the limiting case of
the bear spread, and the pay-off is $\theta(K - S_T) = 1 - \theta(S_T - K)$, giving 1 if $S_T < K$ and
zero otherwise.

• Wingspreads (also called Condors) consist of two long and two short positions in
calls. These are similar to butterfly spreads, except the body of the payoff function
has a flat maximum instead of a vertex; in formulas, the payoff function is

$$(S_T - K_1)_+ - (S_T - K_2)_+ - (S_T - K_3)_+ + (S_T - K_4)_+, \tag{1.237}$$

with $K_1 < K_2 < K_3 < K_4$ and $K_2 - K_1 = K_4 - K_3$.

• Straddles involve the simultaneous purchase or sale of an equivalent number of calls
and puts on the same underlying with the same strike and same expiration. The straddle
buyer speculates that the realized volatility up to the option's maturity will be large
and cause large deviations for the price of the underlying asset. The pay-off is

$$(S_T - K)_+ + (K - S_T)_+. \tag{1.238}$$

• Strangles are similar to straddles, except the call is struck at a different level than the
put; i.e.,

$$(S_T - K_1)_+ + (K_2 - S_T)_+, \tag{1.239}$$

with $K_1 > K_2$ or $K_1 < K_2$. The case $K_1 < K_2$ is an in-the-money strangle, and $K_1 > K_2$
is an out-of-the-money strangle, since the minimum payoff values attained are $K_2 - K_1$
and zero, respectively.

• Calendar spreads are spread options where the expiration dates are different and the
strike prices are the same, for example:

$$(S_{T_1} - K)_+ - (S_{T_2} - K)_+, \tag{1.240}$$

with $T_1 \ne T_2$. This option strategy is added here for completeness, although it differs
from all of the foregoing because the portfolio involves options of varying expiry dates.
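
Before turning to replication, the relation between narrow butterfly spreads and the risk-neutral density in equation (1.232) can be illustrated numerically. The sketch below prices the butterfly (1.227) with Black-Scholes calls and compares it with the discounted lognormal density implied by (1.210). The choice of the Black-Scholes model as the source of call prices, the function names, and the inputs are assumptions made for illustration; the relation itself is model independent.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s0, k, T, sigma, r):
    """Black-Scholes call price, equation (1.217)."""
    vol = sigma * math.sqrt(T)
    d_plus = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * T) / vol
    return s0 * norm_cdf(d_plus) - k * math.exp(-r * T) * norm_cdf(d_plus - vol)

def butterfly_price(s0, k, T, eps, sigma, r):
    """Normalized butterfly spread of equation (1.227) built from call prices."""
    wings = bs_call(s0, k - eps, T, sigma, r) + bs_call(s0, k + eps, T, sigma, r)
    return (wings - 2.0 * bs_call(s0, k, T, sigma, r)) / eps ** 2

def lognormal_density(s0, s_T, T, sigma, r):
    """Risk-neutral density of S_T implied by the lognormal law (1.210)."""
    z = (math.log(s_T / s0) - (r - 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    return math.exp(-0.5 * z * z) / (s_T * sigma * math.sqrt(2.0 * math.pi * T))

s0, k, T, sigma, r = 100.0, 95.0, 1.0, 0.2, 0.05   # illustrative values (assumed)
discounted_density = math.exp(-r * T) * lognormal_density(s0, k, T, sigma, r)
for eps in (5.0, 1.0, 0.1):
    price = butterfly_price(s0, k, T, eps, sigma, r)
    print(f"eps = {eps:4.1f}: butterfly = {price:.8f}, exp(-rT) * p = {discounted_density:.8f}")
```

As $\epsilon$ shrinks, the butterfly price approaches the discounted density, in line with (1.231) and (1.232). With market quotes in place of model prices, the same second difference gives an estimate of the market-implied density.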


Consider the problem of replicating a generic payoff function $\phi(S)$, $0 < S < \infty$, assumed
throughout to be twice differentiable. By virtue of equation (1.229), one can achieve
replication by means of positions in infinitely narrow butterfly spreads of all possible strikes.
A perhaps more instructive replication strategy involves positions in the underlying stock,
a zero-coupon bond, and European call options of all possible strikes and fixed expiration
time $T$. Assuming $\phi(0)$ and $\phi'(0)$ exist, the formula is

$$\phi(S) = \phi(0) + \phi'(0) S + \int_0^\infty n(K)\, C(S, K)\,dK. \tag{1.241}$$

Here $C(S, K) = (S - K)_+$ is the call pay-off and $n(K)\,dK$ represents the size of the position in
the call of strike $K$. The function $n(S)$ is related
to the payoff function and can be evaluated by differentiating equation (1.241) twice:

$$\phi''(S) = \int_0^\infty n(K)\,\delta(S - K)\,dK = n(S). \tag{1.242}$$


Here we make use of the identity

$$\frac{\partial^2}{\partial S^2}(S - K)_+ = \frac{\partial^2}{\partial K^2}(S - K)_+ = \delta(S - K). \tag{1.243}$$

As shown in Problem 3 of this section, equation (1.241) can be derived via an
integration-by-parts procedure. The conclusion we can draw is that if calls of all strikes are available, the
arbitrage-free price $f_0 = f_0(S_0, T)$ at time $t = 0$ of a contingent European claim with payoff
$\phi(S_T)$ at maturity $t = T$ is

$$f_0 = \phi(0) Z_0(T) + \phi'(0) S_0 + \int_0^\infty n(K)\, C_0(S_0, K, T)\,dK. \tag{1.244}$$


Besides the basic assumption that asset prices satisfy equation (1.205), it is crucial to
point out that the foregoing replication formulas follow without any assumption on the model
of the underlying stock motion; i.e., the replication equations also hold for a
stochastic process of a more general form that includes the lognormal model as a special case.
Moreover, these equations can be extended to apply to a payoff $\phi(S)$ defined on a region
$S \in [S_0, S_1]$, where $S_0$, $S_1$ may be taken as either finite or infinite. Specifically, let us consider
the interval $[S_0, S_1]$; then, using the delta function integration property¹⁴ and assuming $\phi(S_0)$,
$\phi'(S_0)$ exist, one can derive

$$\phi(S) = \phi(S_0) + \phi'(S_0)(S - S_0) + \int_{S_0}^{S_1} \phi''(K)(S - K)_+\,dK. \tag{1.245}$$

The discretized form of this formula reads

$$\phi(S) \approx \phi(S_0) + \phi'(S_0)(S - S_0) + \sum_{i=1}^{N} (\Delta K_i)\,\phi''(K_i)(S - K_i)_+, \tag{1.246}$$

where the $K_i$ are chosen as $S_0 < K_1 < K_2 < \cdots < K_N \le S_1$. Let us assume that the strikes are
chosen as equally spaced, $\Delta K_i = K_i - K_{i-1} = \Delta K$. Hence, the replication consists of a cash
position of size $\phi(S_0) - \phi'(S_0) S_0$, a stock position of size $\phi'(S_0)$, and $N$ call positions of
size $(\Delta K_i)\phi''(K_i)$ in calls struck at $K_i$. In most practical cases, this formula actually offers a
more accurate discrete representation than the analogous form obtained from discretizing the
integral in equation (1.241). This is especially the case when considering a pay-off whose
nonzero values are localized to a region $[S_0, S_1]$ for finite $S_1$ or to a region $[S_0, \infty)$, with $S_0 > 0$.


14 Here one uses the general property $\int_{S-\beta}^{S+\eta} \delta(S - K)\phi(K)\,dK = \phi(S)$ for any real constants $\beta, \eta > 0$.
This is the situation for pay-offs of the general form $A(S, X)\,1_{\mathcal{A}}$ for some function $A(S, X)$
with strike $X > 0$. Here $1_{\mathcal{A}}$ is the indicator function having unit value if condition $\mathcal{A}$
is satisfied and zero otherwise. If $\mathcal{A}$ is chosen as the condition $S > X$, then $1_{S > X} = \theta(S - X)$. The plain European
call pay-off is obtained with the obvious choice $A(S, X) = S - X$. It should also be noted that
an alternate replication formula involving puts at various strikes (instead of calls) is readily
obtained in a manner similar to before or by a simple application of put-call parity (see
Problem 6), giving

$$\phi(S) = \phi(S_1) + \phi'(S_1)(S - S_1) + \int_{S_0}^{S_1} \phi''(K)(K - S)_+\,dK, \tag{1.247}$$

assuming that $\phi(S_1)$, $\phi'(S_1)$ exist.

Note that these formulas assume that the payoff function is well behaved at either the
lower endpoint or the upper endpoint. A formula that is valid irrespective of whether the
payoff function is singular at either endpoint can be obtained by subdividing the interval
$[S_0, S_1]$ into two regions: a lower region $[S_0, \bar{S}]$ and an upper region $[\bar{S}, S_1]$ for any $\bar{S}$ with
$S_0 < \bar{S} < S_1$. In the lower region we use puts, while calls are used for the upper region. In
particular, via a straightforward integration-by-parts procedure one can derive (see Problem 7)

$$\phi(S) = \phi(\bar{S}) + \phi'(\bar{S})(S - \bar{S}) + \int_{S_0}^{\bar{S}} \phi''(K)(K - S)_+\,dK + \int_{\bar{S}}^{S_1} \phi''(K)(S - K)_+\,dK. \tag{1.248}$$

One is then at liberty to choose $\bar{S}$, which acts as a kind of separation boundary for whether calls
or puts are used. Note that in the limit $\bar{S} \to S_0$ the formula reduces to that in equation (1.245),
with only calls being used, while the opposing limit $\bar{S} \to S_1$ gives equation (1.247), with only
puts used for replication. A similar approximate discretization scheme as discussed earlier
may be used for these integrals, giving rise to a replication in terms of a finite number of calls
and puts at appropriate strikes. This last formula may hence prove advantageous in practice
when liquidity issues are present. In particular, this replication can be exploited to better
balance the use of available market contracts that are either in-the-money or out-of-the-money
puts or calls.

We now give some examples of applications of the foregoing replication theory.


Example 1. Exponential Pay-Off. 


As a first example, let

$$\phi(S) = (e^{S-X} - 1)_+ = [e^{S-X} - 1]\,\theta(S - X) = \begin{cases} e^{S-X} - 1, & S > X, \\ 0, & S \le X. \end{cases} \tag{1.249}$$

One can readily verify that this payoff function can be exactly replicated using the
right-hand side of either equation (1.241) or equation (1.245) with $S_1 = \infty$. Using $\phi(X) = 0$,
$\phi'(K) = \phi''(K) = e^{K-X}$ (for $K > X$), and adopting the replication formula in equation (1.246)
with $S_0 = X$ and any $S_1 > X$ gives

$$\phi(S) \approx S - X + \sum_{i=1}^{N} w_i (S - K_i)_+, \tag{1.250}$$

with call positions (i.e., weights) $w_i = (\Delta K) e^{K_i - X}$ and strikes $K_i = X + i\,\Delta K$. Note that one
may also use slightly different subdivisions, all of which converge to the same result in the
limit of infinitesimal spacing $\Delta K \to 0$. Figure 1.4 partly shows the result of this replication
strategy in practice. Nearly exact replication is already achieved with only eight strikes.

FIGURE 1.4 Rapid convergence of the static replication of the exponential pay-off defined in
equation (1.249) (in the region $[X, X + L]$ with $X = 10$, $L = 3$) using equation (1.250) with a sum of (a) two
calls with $K_1 = 10.75$, $K_2 = 12.25$ versus (b) four calls with $K_1 = 10.375$, $K_2 = 11.125$, $K_3 = 11.875$,
$K_4 = 12.625$.
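
The convergence described in this example can be reproduced with a few lines of code. The sketch below evaluates the right-hand side of equation (1.250) on $[X, X + L]$ and reports the largest deviation from the exact pay-off (1.249). As an assumption for illustration, the strikes are placed at the midpoints of the subintervals, one of the "slightly different subdivisions" mentioned above; for $N = 2$ and $N = 4$ this gives the strikes quoted in the caption of Figure 1.4. All function names and the evaluation grid are hypothetical.

```python
import math

def exp_payoff(s, x):
    """Exponential pay-off (1.249)."""
    return max(math.exp(s - x) - 1.0, 0.0)

def replicate(s, x, strikes, dk):
    """Right-hand side of (1.250): the S - X leg plus calls with weights dk * exp(K_i - X)."""
    total = s - x
    for k in strikes:
        total += dk * math.exp(k - x) * max(s - k, 0.0)
    return total

def max_error(x, L, n):
    """Largest replication error on a grid over [x, x + L] using n midpoint strikes."""
    dk = L / n
    strikes = [x + (i - 0.5) * dk for i in range(1, n + 1)]  # midpoint strikes (assumed)
    grid = [x + L * j / 200 for j in range(201)]
    return max(abs(replicate(s, x, strikes, dk) - exp_payoff(s, x)) for s in grid)

X, L = 10.0, 3.0   # region used in Figure 1.4
errors = {n: max_error(X, L, n) for n in (2, 4, 8)}
for n, err in errors.items():
    print(f"N = {n}: max replication error on [X, X + L] = {err:.4f}")
```

The error shrinks steadily as strikes are added, consistent with the behavior reported for Figure 1.4.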


Example 2. Sinusoidal Pay-Off. 
Consider the sinusoidal pay-off

$$\phi(S) = \sin\left(\frac{\pi(S - X)}{L}\right) 1_{X \le S \le X + L}, \qquad X, L > 0. \tag{1.251}$$

The choice of strikes $K_i = X + iL/N$, $i = 1, \ldots, N$, with $S_0 = X$ and $S_1 = X + L$, within
equation (1.246) gives

$$\phi(S) \approx \frac{\pi}{L}(S - X) + \sum_{i=1}^{N} w_i (S - K_i)_+, \tag{1.252}$$

where $w_i = -(\pi^2/NL)\sin(i\pi/N)$. Figure 1.5 shows the convergence using this replication
strategy.

FIGURE 1.5 A comparison of three replication curves and the exact sine pay-off defined in
equation (1.251) (in the region $[X, X + L]$ with $X = 10$, $L = 3$) with $N = 4$, $N = 8$, and $N = 12$ short calls, a
long position in the stock, and a short cash position using equation (1.252). With $N = 12$ the replication
is already very accurate.
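
The replication (1.252) can be evaluated directly to see the convergence in $N$. The sketch below is an illustration only; the function names and the evaluation grid are assumptions, while the strikes $K_i = X + iL/N$ and weights $w_i$ are those stated in the example.

```python
import math

def sine_payoff(s, x, L):
    """Sinusoidal pay-off (1.251), nonzero only on [x, x + L]."""
    return math.sin(math.pi * (s - x) / L) if x <= s <= x + L else 0.0

def replicate_sine(s, x, L, n):
    """Right-hand side of (1.252) with strikes K_i = x + i L / n."""
    total = (math.pi / L) * (s - x)
    for i in range(1, n + 1):
        k_i = x + i * L / n
        w_i = -(math.pi ** 2 / (n * L)) * math.sin(i * math.pi / n)
        total += w_i * max(s - k_i, 0.0)
    return total

def max_error(x, L, n):
    """Largest replication error on a grid over [x, x + L]."""
    grid = [x + L * j / 200 for j in range(201)]
    return max(abs(replicate_sine(s, x, L, n) - sine_payoff(s, x, L)) for s in grid)

X, L = 10.0, 3.0   # region used in Figure 1.5
errors = {n: max_error(X, L, n) for n in (4, 8, 12)}
for n, err in errors.items():
    print(f"N = {n:2d}: max replication error on [X, X + L] = {err:.4f}")
```

All weights $w_i$ are negative on this region, which is why the caption of Figure 1.5 speaks of short calls combined with a long stock position and a short cash position.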
Example 3. Finite Number of Market Strikes.


In realistic applications there typically is only a select number of strikes available in the
market, so the trader has no control over the values of $K_i$ to be used in the replication
strategy. In this situation the set of calls (puts) with strikes $K_i$, $i = 1, \ldots, N$, is already
given (i.e., preassigned) for some fixed $N$, and the spacing between strikes is not necessarily
uniform. A solution to this problem is to consider a slight variation to equation (1.246) and
write the finite expansion

$$\phi(S) \approx w_{-1} + w_0 S + \sum_{i=1}^{N} w_i (S - K_i)_+. \tag{1.253}$$

The coefficient $w_{-1}$ gives the cash position, while the weight $w_0$ gives the stock position,
and the weights $w_i$ give the positions in the calls struck at values $K_i$. The goal is to find the
positions $w_i$ providing the best fit, in the linear least squares sense, as follows. By subdividing
the stock price space $[S_0, S_1]$ into $M$ interval slices $S^{(j)}$, with $S^{(j)} < S^{(j+1)}$, $j = 1, \ldots, M$,
the $N + 2$ positions $w_i$ can be determined by matching the approximate payoff function on
the right-hand side of equation (1.253) to the value of the exact payoff function $\phi(S^{(j)})$ at
these $M$ stock points. This leads to a linear system of $M$ equations in the $N + 2$ unknown
weights $w_i$:

$$\phi(S^{(j)}) = w_{-1} + w_0 S^{(j)} + \sum_{i=1}^{N} w_i (S^{(j)} - K_i)_+, \qquad j = 1, \ldots, M. \tag{1.254}$$

One can always make the choice $M \ge N + 2$ so that there are at least as many equations as
unknown weights. A solution to this system can be found in the linear least squares sense,
giving the $w_i$. This technique is fairly robust and also offers a rapidly convergent replication.
The reader interested in gaining further experience with the actual numerical implementation
of this procedure as applied to logarithmic pay-offs is referred to the numerical project in
Part II of this book dealing specifically with the replication of the static component of variance
swap contracts.
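
The least squares fit described in this example can be sketched in a few lines. The code below is a minimal illustration, not the Part II project: it builds the system (1.254) for the sine pay-off of Example 2 and solves the associated normal equations with a small Gaussian elimination routine written in plain Python. The strike values, the sampling grid, and all function names are hypothetical, and a library least squares solver (for example one based on a QR or singular value decomposition) would be the more robust choice in practice.

```python
import math

def basis_row(s, strikes):
    """One row of system (1.254): cash, stock, and one call pay-off per strike."""
    return [1.0, s] + [max(s - k, 0.0) for k in strikes]

def solve_linear(a, b):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def least_squares_weights(payoff, strikes, samples):
    """Weights (w_-1, w_0, w_1, ..., w_N) from the normal equations A^T A w = A^T b."""
    rows = [basis_row(s, strikes) for s in samples]
    targets = [payoff(s) for s in samples]
    p = len(strikes) + 2
    ata = [[sum(row[i] * row[j] for row in rows) for j in range(p)] for i in range(p)]
    atb = [sum(row[i] * t for row, t in zip(rows, targets)) for i in range(p)]
    return solve_linear(ata, atb)

def fitted_value(s, strikes, weights):
    """Right-hand side of (1.253) for given weights."""
    return sum(w * b for w, b in zip(weights, basis_row(s, strikes)))

X, L = 10.0, 3.0

def payoff(s):
    """Sine pay-off (1.251) restricted to [X, X + L]."""
    return math.sin(math.pi * (s - X) / L)

strikes = [10.4, 10.9, 11.3, 12.0, 12.4, 12.8]   # hypothetical, unevenly spaced strikes
samples = [X + L * j / 60 for j in range(61)]    # M = 61 stock points, M >= N + 2
weights = least_squares_weights(payoff, strikes, samples)
fit_error = max(abs(fitted_value(s, strikes, weights) - payoff(s)) for s in samples)
print("cash, stock, call weights:", [round(w, 4) for w in weights])
print(f"max fit error at the sample points = {fit_error:.4f}")
```

The first two entries of `weights` are the cash and stock positions $w_{-1}$ and $w_0$; the remaining entries are the call positions at the preassigned strikes.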


Problems 


Problem 1. A particular representation of the Dirac delta function $\delta(x)$ is given by the limit
$\epsilon \to 0$ of the sequence of functions $f_\epsilon(x) = (1/\epsilon^2)(\epsilon - |x|)_+$. Using this fact, demonstrate
that the butterfly spread pay-off defined in equation (1.228) gives the Dirac delta function
$\delta(S_T - K)$ in the limit $\epsilon \to 0$.


Problem 2. Consider the bull spread portfolio with maximum pay-off normalized to unity:

$$\frac{C_T(S, K) - C_T(S, K + \epsilon)}{\epsilon}, \tag{1.255}$$

$C_T(S, K) = (S - K)_+$. Compute the limit $\epsilon \to 0$ and thereby obtain the pay-off of a bull digital.


Problem 3. Show that under suitable assumptions on the function $\phi$ [i.e., $\phi(0)$ and $\phi'(0)$
exist] we have

$$\int_0^\infty \phi''(K)(S - K)_+\,dK = \phi(S) - \phi'(0) S - \phi(0), \tag{1.256}$$

hence verifying equation (1.241). For this purpose use integration by parts twice, together
with the property in equation (1.243) as well as the identity

$$\frac{\partial}{\partial S}(S - K)_+ = \theta(S - K), \tag{1.257}$$

where $\theta(x)$ is the Heaviside unit step function having value 1 for $x > 0$ and 0 for $x < 0$.
Note that the derivative of this function gives the Dirac delta function.


Problem 4. Demonstrate explicitly that the pay-offs of Examples 1 and 2 of this section
satisfy equation (1.245) with $S_0 = X$, $S_1 = X + L$, $L > 0$.


Problem 5. Assume that calls of all strikes are available for trade and have a known price.
Express the present value of the log payoff $\phi(S_T) = \log(S_T/a)$, with constant $a > 0$, in terms of
call option prices of all strikes $K > 0$. Find a similar expression in terms of put option prices.





Problem 6. Apply equation (1.241) to a call payoff $\phi(S) = (S - X)_+$, with constant $X$, to
obtain the put-call parity relation

$$(S - X)_+ = S - X + (X - S)_+, \qquad (1.258)$$

for all $S > 0$. In deriving this result, the property in equation (1.243) is useful. Now substitute
the right-hand side of this put-call parity formula into equation (1.245) and integrate
by parts to arrive at equation (1.247).


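Problem 7. Consider the interval $S \in [S_0, S_1]$. Integrate by parts twice while using the general
properties stated earlier for the functions $\theta(x)$, $(x)_+$, and the delta function $\delta(x)$ to arrive at
the identities

$$\int_{S_0}^{\bar{S}} \phi''(K)(K - S)_+ \, dK = \phi(S)\,\mathbf{1}_{S_0 \le S < \bar{S}} - \phi(\bar{S})\,\theta(\bar{S} - S) + \phi'(\bar{S})(\bar{S} - S)_+ \qquad (1.259)$$

and

$$\int_{\bar{S}}^{S_1} \phi''(K)(S - K)_+ \, dK = \phi(S)\,\mathbf{1}_{\bar{S} \le S \le S_1} - \phi(\bar{S})\,\theta(S - \bar{S}) - \phi'(\bar{S})(S - \bar{S})_+, \qquad (1.260)$$

where $\mathbf{1}_D$ is the indicator function having unit value for the domain $D$ and zero otherwise.
Add these two expressions to finally obtain equation (1.248).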


Problem 8. Using risk-neutral valuation, i.e., equation (1.166), derive the Black-Scholes
pricing formula for the price of a European digital call and that of a digital put struck at $K$ with
time to maturity $T$. For simplicity assume geometric Brownian motion with constant interest
rate and volatility. Interpret the meaning of the digital option prices in terms of the price of a
standard call. Hint: The derivation of the European digital call boils down to computing the
risk-neutral probability $P(S_T > K)$, where the algebraic steps are similar to what is used to
derive a standard call price.


Problem 9. Derive the Greeks $\Delta$, $\Gamma$, and vega for a European digital call.


1.9 Continuous-Time Financial Models 


In this section, we introduce the basic concepts in continuous-time finance. Derivative claims 
are structured as contracts written on underlying assets that can be used as hedging instru- 
ments. An elegant mathematical structure underlying these financial concepts is reviewed in 
this section. 

In perfect-markets models, a basic asset price process is given by a money-market account
on which we can deposit and out of which we can borrow without limits. The value at time
$t$ of one dollar deposited in a money-market account at initial time $t = 0$ with continuously
compounded interest up to time $t$ is denoted by $B_t$.

Definition 1.10. Money-Market Account. Assuming continuous compounding, a money-market
account is an asset price process $B_t$ that is monotonically increasing in time, has zero
volatility, and follows an equation of the form

$$dB_t = r_t B_t \, dt, \qquad (1.261)$$

where $r_t$ is a stochastic process that is positive at all times.$^{15}$ By integrating equation (1.261)
we find the stochastic integral representation

$$B_t = e^{\int_0^t r_s \, ds}. \qquad (1.262)$$


The instantaneous rate (or short rate) $r_t$ is assumed positive at all times. This is a way
to implicitly account for an important restriction: If interest rates were negative, an arbitrage
strategy would be to borrow money at negative interest and hold the cash in a safety deposit box
instead of in an interest-bearing account. Assuming that security costs to store money in a
safety deposit box are negligible, the existence of such a strategy constrains interest rates to stay
positive.
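
As a small numerical illustration of equations (1.261) and (1.262), the sketch below builds $B_t$ on a time grid from a sampled short-rate path. The positive short-rate path used here is a purely hypothetical stand-in (no particular interest rate model is implied), and the helper name `money_market_account` is introduced only for this example.

```python
import numpy as np

def money_market_account(r, dt):
    """B_t = exp(int_0^t r_s ds) on a uniform grid, with B_0 = 1 (trapezoid rule)."""
    integral = np.concatenate(([0.0], np.cumsum(0.5 * (r[1:] + r[:-1]) * dt)))
    return np.exp(integral)

rng = np.random.default_rng(0)
n, dt = 1000, 1.0 / 1000
# Hypothetical positive short-rate path: exponential of a scaled random walk around 3%
r = 0.03 * np.exp(0.2 * np.sqrt(dt) * np.cumsum(rng.standard_normal(n + 1)))

B = money_market_account(r, dt)

# Euler discretization of dB_t = r_t B_t dt, for comparison with the integral form
B_euler = np.concatenate(([1.0], np.cumprod(1.0 + r[:-1] * dt)))
print(B[-1], B_euler[-1])
```

Because $r_t > 0$ along the whole path, the computed $B_t$ is increasing in $t$ even though the rate itself is random, which is the content of Definition 1.10.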


Definition 1.11. Financial Model: Continuous Time. A continuous-time financial model
$\mathcal{M} = (\mathcal{F}_t, A_t^1, \ldots, A_t^n)$ is given by a filtration $\mathcal{F}_t$ and $n$ price processes as basic hedging
instruments:

$$(A_t^1, \ldots, A_t^n), \qquad t \in \mathbb{R}_+. \qquad (1.263)$$

The value $A_0^i$ can be used to model the current (or spot) price of the $i$th asset if current time
is set as $t = 0$, and the random variable $A_t^i$ models the price of the $i$th asset at any time $t > 0$.


Definition 1.12. Diffusion Pricing Model. In a diffusion model the price processes of all
hedging instruments (or securities) obey stochastic differential equations of the form

$$\frac{dA_t^i}{A_t^i} = \mu_t^{A^i} \, dt + \sum_{\alpha=1}^{M} \sigma_{\alpha,t}^{A^i} \, dW_t^\alpha. \qquad (1.264)$$

Here, the $dW_t^\alpha$, $\alpha = 1, \ldots, M$, are increments of independent Brownian motions (or Wiener processes)
with $E[dW_t^\alpha] = 0$ and $E[dW_t^\alpha \, dW_t^\beta] = \delta_{\alpha\beta} \, dt$. The functions $\sigma_{\alpha,t}^{A^i}$ are so-called lognormal
volatilities of the $i$th asset price process $(A_t^i)_{t \ge 0}$ with respect to the $\alpha$th Brownian motion
(i.e., with respect to the $\alpha$th risk factor), and the functions $\mu_t^{A^i}$ are lognormal drifts of the $i$th
asset price process. These are generally functions of the asset values $A_t^1, \ldots, A_t^n$ and time $t$.


Note: We can assume further that one of the assets, e.g., $A_t^1$, is the money-market account,
which is the only asset characterized by having zero volatility; in this case $\sigma_{\alpha,t}^{A^1} = 0$ for all
$\alpha = 1, \ldots, M$.


Definition 1.13. Adapted Process. A stochastic process $\xi_t$ is adapted to the filtration $\mathcal{F}_t$ if
$\xi_t$ is a random variable in the probability space generated by $\mathcal{F}_t$. In other words, the value
of $\xi_t$ depends only on the values taken by the paths $(A_s^1, \ldots, A_s^n)$ for $0 \le s \le t$, as they were
realized up to time $t$, i.e., $\xi_t$ is $\mathcal{F}_t$-measurable.


15 Technically, $B_t$ is of zero quadratic variation because the differential contains no term with $dW_t$; however, $r_t$
can generally be stochastic.

Definition 1.14. Stopping Time. A stopping time $\tau \in (0, T]$, for any finite time $T$, is an
$\mathcal{F}_t$-measurable positive random variable such that the time event $\{t = \tau\}$, with probability
$P(\tau < \infty) = 1$, corresponds to a decision to stop and is determined entirely by the information
set $\mathcal{F}_t$ up to time $t = \tau$. That is, given the filtration $\mathcal{F}_t$ we know whether or not $\tau \le t$.


Note that for asset-pricing purposes the information set $\mathcal{F}_t$ basically derives from the set
of all asset price paths $(A_t^1, \ldots, A_t^n)$, $0 \le t \le \tau$. This rather technical definition and abstract
concept of a stopping time is best illustrated with examples. For instance, let $x_t$ be some
real-valued diffusion process (e.g., a Wiener process) and let $[a, b] \subset \mathbb{R}$ be a given fixed
finite interval. Assume initially $x_0 \notin [a, b]$ at time $t = 0$ and allow the process to evolve in
time $t > 0$ up to time $T$. The random variable defined by

$$\tau = \begin{cases} \min\{t : x_t \in [a, b]\}, & \text{if such a time } t \le T \text{ exists}, \\ T, & \text{otherwise} \end{cases} \qquad (1.265)$$

is then a stopping time and corresponds to the first entry time $t \le T$ of the process $x_t$ into
the interval $[a, b]$. Some basic useful properties of stopping times follow readily, such as
additivity: If $\tau_1$ and $\tau_2$ are two stopping times in a given time interval, then $\tau = \tau_1 + \tau_2$ is also
a stopping time and, moreover, $\min(\tau_1, \tau_2)$ and $\max(\tau_1, \tau_2)$ are also stopping times. In the
pricing of European-style options the expiration time is an example of a stopping time that
is actually known at contract inception. In contrast, for American-style options the expiration
period (or lifetime of the contract) is still finite, yet there is the added freedom of early
exercise. As we shall see in Section 1.14, the early-exercise time is actually an example of an
optimal stopping time that is (dynamically) determined by the level of the asset or stock price
at the time of early exercise. Other examples of stopping times and derivative instruments
are given by barrier contracts, for which the pay-off depends on whether or not a certain
price process crosses a given barrier in the future. Suppose $H$ is a fixed number, and define
$\tau$ as the time $t = \tau$ at which $A_t = H$ for the first time, subject to the initial condition $A_0$.
Then $\tau$ is a stopping time. Cash flows for barrier options can occur at the time the barrier
is crossed or at maturity. A counterexample to a stopping time is the time $\tau'$, defined as the
last time before a given maturity date $T$ for which $A_t = H$. $\tau'$ is not a stopping time because
knowledge about when $\tau'$ occurs requires information on the full path $A_t$ for all $t \in [0, T]$ and
in particular for times after $\tau'$ itself.
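
The difference between a first entry time and a last visit time can be made concrete on a discretized path. The sketch below is only an illustration: it uses a deterministic oscillating path as a reproducible stand-in for the process, and the helper names `first_entry_index` and `last_visit_index` are hypothetical. The first entry time computed from the path observed up to any later date agrees with the one computed from the full path, whereas the last visit time changes once more of the path is revealed.

```python
import numpy as np

def first_entry_index(path, a, b):
    """Index of the first grid time with path value in [a, b], or None if there is none."""
    hits = np.flatnonzero((path >= a) & (path <= b))
    return int(hits[0]) if hits.size else None

def last_visit_index(path, a, b):
    """Index of the last grid time with path value in [a, b], or None if there is none."""
    hits = np.flatnonzero((path >= a) & (path <= b))
    return int(hits[-1]) if hits.size else None

k = np.arange(201)
path = 2.0 * np.sin(2.0 * np.pi * k / 50.0)   # deterministic stand-in for x_t, with x_0 = 0
a, b = 1.0, 1.5                               # interval [a, b], not containing x_0

tau = first_entry_index(path, a, b)
# Knowing the path only up to grid time m >= tau already determines tau
prefix_ok = all(
    first_entry_index(path[: m + 1], a, b) == tau for m in range(tau, len(path))
)

# The last visit time is not determined by the path observed so far
last_full = last_visit_index(path, a, b)
last_half = last_visit_index(path[:101], a, b)
print(tau, prefix_ok, last_half, last_full)
```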


Definition 1.15. Derivative Instrument.$^{16}$ A derivative instrument, or contingent claim, is
a contractual agreement between two parties who agree to exchange a cash flow stream in
the future, where the cash flow amounts are adapted processes and the timings are stopping
times in the given financial model. A discrete cash flow stream is modeled by a sequence of
pairs $(\tau_j, c_j)$, $j = 1, \ldots, m$, where the $\tau_j$ are stopping times and the $c_j$ are cash flow amounts
depending on the price processes $(A_t^1, \ldots, A_t^n)$ up to time $\tau_j$. Continuous cash flow streams
are modeled by more general adapted processes $\gamma_t$ such that $d\gamma_t$ is the cash flow occurring
in the time interval $[t, t + dt)$. In the particular case of a discrete cash flow stream $(\tau_j, c_j)$,


$\tau_p \le t < \tau_{p+1}$, the continuous-time representation $\gamma_t$ is given by

$$\int_0^t d\gamma_s = \sum_{j=1}^{p} c_j. \qquad (1.266)$$


16 It should be clearly understood that we are throughout assuming all claims or assets are nondefaultable; e.g., the
money-market account is assumed nondefaultable. The definition must be modified in the case of defaultable
(credit) derivatives, where pricing depends on time of default and recovery, quantities not directly observable from
market-traded instruments.

An example of a continuous cash flow stream is given by exchange-traded futures and
options contracts. These contracts have the same final pay-off as forward and ordinary option 
contracts. However, to reduce credit risk to a minimum, exchanges ask investors to hold a 
margin account and mark-to-market gains and losses on a daily basis based on realized prices 
or to unwind the position. This results in a daily stream of cash flows that can be modeled as 
continuous. 


Definition 1.16. Self-Financing Trading Strategy. A self-financing trading strategy in the
hedging instruments $A_t^1, \ldots, A_t^n$ is a zero cash flow-replicating strategy for all time $t \in [0, T]$.
That is, this strategy consists of a portfolio of positions $\xi_t^i$ in the assets $A_t^i$, with value
$V_t = \sum_{i=1}^{n} \xi_t^i A_t^i$, where the $\xi_t^i$, $i = 1, \ldots, n$, are adapted processes such that at all times
$t \in [0, T]$ we have

$$\sum_{i=1}^{n} (A_t^i + dA_t^i) \, d\xi_t^i = 0. \qquad (1.267)$$

The meaning of the self-financing condition is that the cash flows $d\gamma_t$ resulting at time
$t + dt$ are reinvested in the underlying assets by adjusting the positions $\xi_t^i$, by purchasing or
selling the corresponding hedging instruments at the prices $A_t^i + dA_t^i$ at an infinitesimally later
time $t + dt$ (i.e., positions are readjusted only after the prices have changed during time $dt$). In
this sense the positions are adapted, i.e., nonanticipative with respect to the stochastic changes
in the asset prices. The infinitesimal change in the portfolio value $V_t$ of a self-financing
strategy is only due to changes in the prices of the underlying instruments since there are no
allowed additional cash inflows or outflows after initial time; hence,$^{17}$

$$dV_t = \sum_{i=1}^{n} \xi_t^i \, dA_t^i. \qquad (1.268)$$

In integral form this is written as

$$V_t = V_0 + \sum_{i=1}^{n} \int_0^t \xi_s^i \, dA_s^i. \qquad (1.269)$$

Using Itô's lemma, the change in portfolio value, $dV_t = V_{t+dt} - V_t$, must also satisfy

$$dV_t = \sum_{i=1}^{n} \left[ \xi_t^i \, dA_t^i + A_t^i \, d\xi_t^i + (d\xi_t^i)(dA_t^i) \right]. \qquad (1.270)$$

Equating these two expressions then gives the self-financing condition rewritten in the form
contained in equation (1.267).
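
A discrete-time analogue of equations (1.267) to (1.269) can be checked numerically. In the sketch below, which is an illustration only, the price paths are hypothetical geometric random walks and the rebalancing rule is arbitrary; the only requirement imposed is that each rebalancing, carried out at the new prices, needs no cash. The change in portfolio value over each step then equals the gain from the old holdings.

```python
import numpy as np

rng = np.random.default_rng(1)
n_steps, n_assets = 250, 3
# Hypothetical price paths A[k, i]: geometric random walks started at 100
A = 100.0 * np.exp(np.cumsum(0.01 * rng.standard_normal((n_steps + 1, n_assets)), axis=0))

xi = np.empty((n_steps + 1, n_assets))
xi[0] = [1.0, 2.0, -0.5]                      # arbitrary initial holdings
for k in range(n_steps):
    target = xi[k].copy()
    # Arbitrary rule for the first n-1 holdings, using only prices observed so far
    target[:-1] = 1.0 + 0.01 * (A[k, :-1] - A[0, :-1])
    # Fix the last holding so that rebalancing at the NEW prices A[k+1] costs nothing:
    # sum_i A[k+1, i] * (xi[k+1, i] - xi[k, i]) = 0, the discrete form of (1.267)
    cash_needed = A[k + 1, :-1] @ (target[:-1] - xi[k, :-1])
    target[-1] = xi[k, -1] - cash_needed / A[k + 1, -1]
    xi[k + 1] = target

V = np.sum(xi * A, axis=1)                               # portfolio value V_k
gains = np.sum(xi[:-1] * np.diff(A, axis=0), axis=1)     # sum_i xi_k^i (A_{k+1}^i - A_k^i)

# Discrete versions of (1.268) and (1.269)
print(np.allclose(np.diff(V), gains), np.allclose(V, V[0] + np.concatenate(([0.0], np.cumsum(gains)))))
```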


Definition 1.17. Self-Financing Replicating Strategy. A self-financing replicating strategy
(or perfect hedge) in the hedging instruments $A_t^1, \ldots, A_t^n$ that replicates a given cash flow
stream $d\gamma_t$, where $\gamma_t$ is a given contingent claim at time $t$ in some time interval $t \in [0, T]$,
is defined as a family of adapted processes $\xi_t^i$, $i = 1, \ldots, n$, such that at all times $t \in [0, T]$
we have

$$\gamma_t = \gamma_0 + \sum_{i=1}^{n} \int_0^t \xi_s^i \, dA_s^i, \qquad (1.271)$$


17 Note: We assume throughout that the assets do not pay dividends, although in the case of dividends the
appropriate formulas extend in a simple manner.

or, equivalently in differential form,

$$d\gamma_t = \sum_{i=1}^{n} \xi_t^i \, dA_t^i. \qquad (1.272)$$


In the case of a European-style option with payoff $\phi(S_T)$ at time $T$, where $S_t$ is the
underlying stock price process, a self-financing replication strategy in the stock and the
money-market account, with value $\xi_t^1 B_t + \xi_t^2 S_t$ at time $t$, would satisfy

$$B_t \, d\xi_t^1 + (S_t + dS_t) \, d\xi_t^2 = 0 \qquad (1.273)$$

for all times $t \in [0, T)$. [Note that the term $dB_t = r_t B_t \, dt$ vanishes since it gives rise to a term
of $O((dt) \, d\xi_t^1)$, i.e., of order higher than $dt$.] At time $T$, the position is unwound so that the
payout $\phi(S_T)$ [i.e., $\gamma_T = \phi(S_T)$ in this case] is generated; i.e., the portfolio has terminal value

$$\xi_T^1 B_T + \xi_T^2 S_T = \phi(S_T). \qquad (1.274)$$

In the case of a barrier or American option, where the payout occurs at a stopping time
$0 < \tau \le T$, the equation (1.273) is valid until time $\tau$, at which point we have

$$B_\tau \xi_\tau^1 + S_\tau \xi_\tau^2 = \phi(S_\tau). \qquad (1.275)$$


One of the main problems in pricing theory is whether or not the cash flow streams associ- 
ated with a contingent claim can be replicated by means of a self-financing trading strategy. If 
a self-financing trading strategy exists and reproduces all the cash flows of a given contingent 
claim, then the present value of the cash flow stream can (uniquely in case of no arbitrage) 
be identified as the cost of setting up the self-financing trading strategy. The question of 
whether such a self-financing strategy exists relates to attainability and market completeness. 

The practical implementation of trading strategies is limited by the existence of transaction 
costs, by liquidity effects, which pose restrictions on the amounts of a given instrument that 
can be traded at the posted price, and by the delays with which information reaches market 
participants. To a first approximation, these effects can be taken into account implicitly by 
assuming that there are no imperfections. A key role is played by the condition of absence 
of arbitrage, which is stated next and which implies that all portfolios with the same payoff 
structure have the same price. Asking for absence of arbitrage is a way of accounting for 
finite market liquidity since, in fact, if an asset had two different prices, trades to exploit the 
opportunity would cause the prices to realign. 


Definition 1.18. Arbitrage: Continuous Time. The self-financing trading strategy $(\xi_t^1, \ldots, \xi_t^n)$,
$0 \le t \le T$, in the hedging assets $(A_t^1, \ldots, A_t^n)$ is an arbitrage strategy if either of the
following two conditions holds.
A1. The portfolio value process

$$V_t = \sum_{i=1}^{n} \xi_t^i A_t^i \qquad (1.276)$$

is such that $V_0 < 0$ and $P(V_T \ge 0) = 1$.
A2. The value process $V_t$ is such that $V_0 = 0$ and $P(V_T > 0) > 0$ with $P(V_t \ge 0) = 1$ for all
$t \in [0, T]$.


In plain language, condition A2 says that an arbitrage opportunity is a self-financed
strategy that can generate a profit at zero cost and with no possibility of a loss at any time
during the strategy.

Typically, when solving the replication problem for a cash flow stream, the current price
of the stream is not known a priori. Knowledge of the cash flow stream, however, is sufficient
because if a trading strategy replicates the cash flows, in virtue of the hypothesis of absence
of arbitrage, the value of this strategy at all times yields the price or value process $V_t$. Next
we consider a couple of examples of replication (or hedging) strategies. One is static in time;
the other is dynamic.


Example 1. Perpetual Double Barrier Option.


Suppose there are no carry costs such as interest rates or dividends for holding a position
in the stock. Consider a perpetual option with two barriers: a lower barrier at stock
value $L$ and an upper barrier at $H$, with $L < H$. If the stock price touches the lower barrier
before it touches the upper barrier, the holder receives $R_L$ dollars and the contract terminates.
Otherwise, whenever the upper barrier is hit first, the holder receives $R_H$ dollars and
the contract terminates. The problem is to find the price and a hedging strategy for this
contract.

To solve this problem, let $\tau_L$ be the stopping time for hitting the lower barrier and $\tau_H$
be the stopping time for hitting the higher barrier. The stopping time $\tau$ at which the option
expires is the minimum of these times,

$$\tau = \min(\tau_L, \tau_H). \qquad (1.277)$$

If one considers a replicating portfolio $f_t = a S_t + b$ at any time $t$, then the barrier levels give
rise to two equations:

$$aH + b = R_H, \qquad aL + b = R_L, \qquad (1.278)$$

corresponding to the portfolio value (i.e., payout) for hitting either barrier. The value $f_\tau$ of
the perpetual double barrier contract evaluated at the stopping time $t = \tau$ is

$$f_\tau = a S_\tau + b. \qquad (1.279)$$

Solving the system in equation (1.278) for the portfolio weights $a$ and $b$, we find that

$$a = \frac{R_H - R_L}{H - L}, \qquad b = R_L - aL. \qquad (1.280)$$

Absence of arbitrage therefore implies that the price process followed by $f_t$ is given by the
value of the portfolio $a S_t + b$ that replicates the cash flows.
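
The static hedge in equations (1.278) to (1.280) can be checked with a short simulation. The sketch below is only illustrative: the barrier levels, rebates, and spot are hypothetical, and the stock is modeled as a driftless symmetric random walk on an integer grid, which is an assumption made here so that the stock is a martingale when carry costs vanish. The average simulated payout should then be close to $a S_0 + b$.

```python
import numpy as np

def double_barrier_weights(L, H, R_L, R_H):
    """Stock and cash positions (a, b) solving aH + b = R_H and aL + b = R_L."""
    a = (R_H - R_L) / (H - L)
    b = R_L - a * L
    return a, b

L, H, R_L, R_H, S0 = 90, 120, 1.0, 3.0, 100   # hypothetical contract and spot
a, b = double_barrier_weights(L, H, R_L, R_H)
price = a * S0 + b

# Monte Carlo check with a driftless +/-1 random walk stopped at the first barrier hit
rng = np.random.default_rng(42)
n_paths = 20000
S = np.full(n_paths, S0)
active = np.ones(n_paths, dtype=bool)
while active.any():
    S[active] += rng.choice([-1, 1], size=active.sum())
    active = (S > L) & (S < H)
mc_price = np.mean(np.where(S >= H, R_H, R_L))
print(price, mc_price)
```

Note that the hedge itself never needs rebalancing: the positions `a` and `b` are fixed at inception, which is why the strategy is called static.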


Example 2. Dynamic Hedging in the Black-Scholes Model


Consider the Black-Scholes model with a stock price following geometric Brownian motion,

$$\frac{dS_t}{S_t} = \mu \, dt + \sigma \, dW_t. \qquad (1.281)$$

In this model, the price at time $t$ of a call struck at $K$ and maturity at calendar time $T > t$
is given by the function $C_{BS}(S_t, K, T - t, \sigma, r)$ in equation (1.217). Let's assume that in this
economy interest rates are constant and equal to $r$.
One can show that the pay-off of the call can be replicated by means of a self-financing
trading strategy that costs $C_t = C_{BS}(S_t, K, T - t, \sigma, r)$ to set up at calendar time $t$. This
strategy involves two adapted processes $a_t$ and $b_t$ for the hedge ratios that give the positions
at calendar time $t$ in two assets: the stock of price $S_t$ and a zero-coupon bond maturing at
time $T$ of price $Z_t(T) = e^{-r(T-t)}$. Namely,

$$C_t = a_t S_t + b_t Z_t(T). \qquad (1.282)$$

To show this, we need to find the two processes $a_t$ and $b_t$. Let us note that self-financing
condition (1.267) in this case reads

$$(S_t + dS_t) \, da_t + Z_t(T) \, db_t = 0. \qquad (1.283)$$

By the differential of equation (1.282) and using the self-financing condition we find

$$dC_t = a_t \, dS_t + r b_t Z_t(T) \, dt. \qquad (1.284)$$

On the other hand, applying Itô's lemma (in one dimension) to the price process $C_t$ (considered
as function of $t$ and $S_t$) we find

$$dC_t = \left( \frac{\partial C_{BS}}{\partial t} + \frac{\sigma^2 S^2}{2} \frac{\partial^2 C_{BS}}{\partial S^2} \right) dt + \frac{\partial C_{BS}}{\partial S} \, dS_t,$$

where $S = S_t$. By equating coefficients in $dt$ and $dS_t$ with the previous equation we find

$$a_t = \frac{\partial C_{BS}}{\partial S} \qquad (1.285)$$

and

$$r b_t Z_t(T) = \frac{\partial C_{BS}}{\partial t} + \frac{\sigma^2 S^2}{2} \frac{\partial^2 C_{BS}}{\partial S^2}. \qquad (1.286)$$

Solving for $b_t$ from replication equation (1.282) gives

$$b_t = Z_t(T)^{-1} (C_t - a_t S_t). \qquad (1.287)$$

Substituting $b_t$ as given by equation (1.287), as well as $a_t$ from equation (1.285) into
equation (1.286), we arrive at the Black-Scholes partial differential equation in current time
$t$ and spot price $S = S_t$:

$$\frac{\partial C_{BS}}{\partial t} + r S \frac{\partial C_{BS}}{\partial S} + \frac{\sigma^2 S^2}{2} \frac{\partial^2 C_{BS}}{\partial S^2} - r C_{BS} = 0. \qquad (1.288)$$

This is precisely the equation satisfied by the function $C_{BS}(S_t, K, T - t, \sigma, r)$ given by
equation (1.217) with $T \to T - t$.

Notice that the parameter $\mu$ in the equation for the stock price process (1.281) appears in
neither the Black-Scholes formula, the Black-Scholes equation, nor the hedge ratios $a_t$ and
$b_t$. Section 1.10 provides a more general explanation of this very notable simplification.
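
The hedge ratios and the partial differential equation (1.288) can be checked numerically with finite differences. The sketch below assumes that equation (1.217) is the standard Black-Scholes call formula; the function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, tau, sigma, r):
    """Standard Black-Scholes call price with time to maturity tau (assumed form of (1.217))."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

S, K, T, t, sigma, r = 100.0, 95.0, 1.0, 0.25, 0.2, 0.05   # hypothetical inputs

def C(s, time):
    """Call price as a function of spot s and calendar time."""
    return bs_call(s, K, T - time, sigma, r)

h, dt = 1e-2, 1e-4
dC_dt = (C(S, t + dt) - C(S, t - dt)) / (2 * dt)
dC_dS = (C(S + h, t) - C(S - h, t)) / (2 * h)
d2C_dS2 = (C(S + h, t) - 2 * C(S, t) + C(S - h, t)) / h**2

# Left-hand side of the Black-Scholes equation (1.288); should be close to zero
residual = dC_dt + r * S * dC_dS + 0.5 * sigma**2 * S**2 * d2C_dS2 - r * C(S, t)

# Hedge ratios from (1.285) and (1.287)
Z = math.exp(-r * (T - t))
a_t = dC_dS
b_t = (C(S, t) - a_t * S) / Z
print(residual, a_t, b_t)
```

The drift $\mu$ does not enter any of these computations, in line with the remark above.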


1.10 Dynamic Hedging and Derivative Asset Pricing in Continuous Time


In this section, we present the main theorem for pricing derivative assets within the
continuous-time framework.

Theorem 1.4. Fundamental Theorem of Asset Pricing (Continuous-Time Case). Part I.
Consider a diffusion continuous-time financial model $\mathcal{M} = (\mathcal{F}_t, A_t^1, \ldots, A_t^n)$, where the
hedging instruments are assumed to satisfy a diffusion equation of the form (1.264), i.e.,

$$\frac{dA_t^i}{A_t^i} = \mu_t^{A^i} \, dt + \sum_{\alpha=1}^{M} \sigma_{\alpha,t}^{A^i} \, dW_t^\alpha, \qquad i = 1, \ldots, n, \qquad (1.289)$$

where $dW_t^\alpha$ are understood to be standard Brownian increments with respect to a specified
probability measure. Also, suppose there exists a money-market account $B_t$ with

$$dB_t = r_t B_t \, dt. \qquad (1.290)$$

Finally, suppose there are no arbitrage opportunities. Then:

(i) Under all equivalent probability measures, there exists a family of adapted processes
$q_t^\alpha$, $\alpha = 1, \ldots, M$ (one for each risk factor), such that, for any asset price process $A_t$ obeying
an equation similar to equation (1.289) with drift $\mu_t^A$ and volatilities $\sigma_{\alpha,t}^A$, the drift term is
linked to the corresponding volatilities by the equation

$$\mu_t^A = r_t + \sum_{\alpha=1}^{M} q_t^\alpha \sigma_{\alpha,t}^A, \qquad (1.291)$$

where $q_t^\alpha$ are independent of the asset $A$ in question.


In finance parlance, the adapted processes $q_t^\alpha$ are known as the price of risk for the $\alpha$th
risk factor (or $\alpha$th Brownian motion). Note that this result applies to any asset obeying a
diffusion process: In particular, the drifts $\mu_t^{A^i}$ and volatilities $\sigma_{\alpha,t}^{A^i}$ of the base asset prices $A_t^i$
are themselves also linked by an equation similar to equation (1.291), with $q_t^\alpha$ independent
of the prices $A_t^i$.


Definition 1.19. Numeraire Asset. Any asset $g_t$ whose price process is positive, in the sense
that $g_t > 0$ for all $t$, can be chosen as the numeraire for pricing. That is, $g_t$ is an asset price
relative to which the values of all other assets $A_t$ are expressed using the ratio $A_t/g_t$.


Theorem 1.5. Fundamental Theorem of Asset Pricing (Continuous-Time Case). Part II.
Under the hypotheses in Part I of the theorem, we have the following: (ii) If $g_t$ is a numeraire
asset, then there exists a probability measure $Q(g)$ for which the price $A_t$ at time $t$ of any
attainable instrument without cash flows up to a stopping time $T > t$ is given by the martingale
condition

$$\frac{A_t}{g_t} = E_t^{Q(g)}\left[ \frac{A_T}{g_T} \right]. \qquad (1.292)$$

Under the measure $Q(g)$ the prices of risk in equation (1.291) for the $\alpha$th factors are given
by the volatilities of $g_t$ for the corresponding $\alpha$th factors:

$$q_t^\alpha = \sigma_{\alpha,t}^g. \qquad (1.293)$$


Note that we are throughout assuming that the contingent claim or derivative instrument
to be priced is attainable, meaning that one can find a self-financing replicating strategy that
exactly replicates the cash flows of the claim. If one also assumes that the financial model
satisfies market completeness, then every contingent claim or cash flow stream is assumed
attainable.

Definition 1.20. Pricing Measure: Continuous Time. Given a numeraire asset price process
$g_t$, the pricing measure associated with $g$ is the martingale measure $Q(g)$ for which
pricing formula (1.292) holds for any asset price process $A_t$.


Definition 1.21. Risk-Neutral Measure. Assuming continuous compounding, the risk-neutral
measure $Q(B)$ is the martingale measure with the money-market account as numeraire
asset $g_t = B_t = e^{\int_0^t r_s \, ds}$.


Theorem 1.6. Fundamental Theorem of Asset Pricing (Continuous-Time Case). Part III.
Under the hypotheses in Part I of the theorem, we have the following: (iii) Under the
risk-neutral measure $Q(B)$ all the components of the price-of-risk vector, $q_t^\alpha$, $\alpha = 1, \ldots, M$,
vanish, and the drift $\mu_t^A$ of any asset price $A_t$ at time $t$ is equal to the riskless rate $r_t$. The
price process for any attainable instrument without cash flows up to any stopping time $T > t$
is given by the expectation at time $t$:

$$A_t = E_t^{Q(B)}\left[ e^{-\int_t^T r_s \, ds} A_T \right]. \qquad (1.294)$$

(iv) Any attainable price process $A_t$ can be replicated by means of a self-financing trading
strategy with portfolio value $V_t = \theta_t^0 B_t + \sum_{i=1}^{n} \theta_t^i A_t^i$ in the base assets $A_t^i$ and in the
money-market account $B_t$:

$$dA_t = dV_t = \theta_t^0 r_t B_t \, dt + \sum_{i=1}^{n} \theta_t^i \, dA_t^i, \qquad (1.295)$$

where the positions $\theta_t^i$, $i = 0, \ldots, n$, satisfy the self-financing condition

$$B_t \, d\theta_t^0 + \sum_{i=1}^{n} (A_t^i + dA_t^i) \, d\theta_t^i = 0. \qquad (1.296)$$
Proof.
(i). Assume no arbitrage and consider a self-financing trading strategy, with components
$\theta_t^1, \ldots, \theta_t^n$ as adapted positions in the family of base assets $A_t^1, \ldots, A_t^n$. Then

$$\sum_{i=1}^{n} (A_t^i + dA_t^i) \, d\theta_t^i = 0 \qquad (1.297)$$

holds. This strategy has portfolio value at time $t$ given by

$$\Pi_t = \sum_{i=1}^{n} \theta_t^i A_t^i. \qquad (1.298)$$

This strategy is instantaneously riskless if the stochastic component is zero, i.e., $d\Pi_t = r_t \Pi_t \, dt$.
Given our assumptions, a riskless strategy exists and can be explicitly constructed as follows.
Using the self-financing condition in equation (1.297) and Itô's lemma for the stochastic
differential $d\Pi_t$ we obtain the infinitesimal change in portfolio value in time $[t, t + dt)$:

$$d\Pi_t = \sum_{i=1}^{n} \left[ (A_t^i + dA_t^i) \, d\theta_t^i + \theta_t^i \, dA_t^i \right] = \sum_{i=1}^{n} \theta_t^i \, dA_t^i. \qquad (1.299)$$

Due to the assumption of no arbitrage, the rate of return on this portfolio over the period
$[t, t + dt)$ must equal the riskless rate of return on the money-market account, i.e., $d\Pi_t = r_t \Pi_t \, dt$.$^{18}$
Substituting equation (1.289) into the foregoing stochastic differential and setting the
coefficients in all the stochastic terms $dW_t^\alpha$ to zero gives

$$\sum_{i=1}^{n} \sigma_{\alpha,t}^{A^i} \theta_t^i A_t^i = 0, \qquad (1.300)$$

for all $\alpha = 1, \ldots, M$. Here the functions $\sigma_{\alpha,t}^{A^i}$ are volatilities in the $\alpha$th factor for each asset
$A^i$. This equation states that the $\mathbb{R}^n$-dimensional vector of components $\theta_t^i A_t^i$ is orthogonal
to the subspace of $M$ vectors (labeled by $\alpha = 1, \ldots, M$) in $\mathbb{R}^n$ having components $\sigma_{\alpha,t}^{A^i}$,
$i = 1, \ldots, n$.

Absence of arbitrage also implies that the portfolio earns a risk-free rate, $d\Pi_t = r_t \Pi_t \, dt$;
hence, setting the drift coefficient in the stochastic differential $d\Pi_t$ to $r_t \Pi_t$ while using
equation (1.298) gives this additional condition:

$$\sum_{i=1}^{n} (\mu_t^{A^i} - r_t) \theta_t^i A_t^i = 0. \qquad (1.301)$$

Here, the quantities $\mu_t^{A^i}$ are drifts for each $i$th asset. Hence equation (1.300) must imply
equation (1.301) for all arbitrage-free strategies satisfying the self-financing condition.
Equation (1.301) states that the $\mathbb{R}^n$-dimensional vector of components $\theta_t^i A_t^i$ must be orthogonal
to the $\mathbb{R}^n$-dimensional vector with components $(\mu_t^{A^i} - r_t)$. This means that if the vector with
components $\theta_t^i A_t^i$ is orthogonal to the $M$ vectors of components $\sigma_{\alpha,t}^{A^i}$, then it is also orthogonal
to the vector of components $(\mu_t^{A^i} - r_t)$. From linear algebra we know that this is possible if
and only if the vector of components $(\mu_t^{A^i} - r_t)$ is a linear combination of the $M$ vectors of
components $\sigma_{\alpha,t}^{A^i}$ (i.e., is contained in the linear subspace spanned by the $M$ vectors). Hence
for any given time $t$, we have

$$\mu_t^{A^i} = r_t + \sum_{\alpha=1}^{M} q_t^\alpha \sigma_{\alpha,t}^{A^i}, \qquad (1.302)$$

with coefficients $q_t^\alpha$ independent of the asset $A^i$, for all $i = 1, \ldots, n$. Since this is true for all
self-financing strategies and choices of base assets, this implies that the same relation must
follow for any asset $A_t$; namely, equation (1.291) obtains.
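
The linear algebra step behind equation (1.302) can be illustrated at a single point in time. In the sketch below all numbers are hypothetical: a volatility matrix for $n = 3$ assets and $M = 2$ risk factors is fixed, arbitrage-free drifts are built from chosen prices of risk, and the prices of risk are then recovered by least squares. A vector orthogonal to both volatility vectors plays the role of the riskless position vector with components $\theta_t^i A_t^i$.

```python
import numpy as np

r = 0.03
# Hypothetical volatility matrix: row i is asset i, column alpha is a risk factor
sigma = np.array([[0.20, 0.00],
                  [0.10, 0.15],
                  [0.05, 0.25]])
q_true = np.array([0.30, 0.10])        # hypothetical prices of risk
mu = r + sigma @ q_true                # drifts consistent with equation (1.302)

# Recover the prices of risk from the drifts and volatilities
q, *_ = np.linalg.lstsq(sigma, mu - r, rcond=None)

# Position vector orthogonal to every volatility vector, as in equation (1.300)
v = np.cross(sigma[:, 0], sigma[:, 1])
riskless_check = v @ sigma             # both entries vanish: no exposure to dW
excess_return = v @ (mu - r)           # vanishes, as in equation (1.301)

# A drift vector outside the span of the volatility vectors violates (1.301):
# the same riskless portfolio would then earn more than the risk-free rate
mu_bad = mu + 0.01 * v / np.linalg.norm(v)
excess_return_bad = v @ (mu_bad - r)
print(q, excess_return, excess_return_bad)
```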

18 A simple argument shows that if the portfolio return is greater than $r_t$, then an arbitrage strategy exists by
borrowing money at the lower rate $r_t$ at time $t$ and investing in the portfolio until time $t + dt$. On the other hand, if
the portfolio return is less than $r_t$, then an arbitrage strategy also exists by short-selling the portfolio at time $t$ and
investing the earnings in the money-market account. Both strategies yield a zero-cost profit.

(ii) Let $g$ be a numeraire asset. The measure $Q(g)$ is specified by the condition in
equation (1.292). At this point we make use of a previously derived result contained in
equation (1.138). Applying that formula now to the quotient $A_t/g_t$, where $A_t$ satisfies an equation of
the form (1.264) (with $A^i$ replaced by $A$) and the numeraire asset $g_t$ satisfies a similar equation,

$$\frac{dg_t}{g_t} = \mu_t^g \, dt + \sum_{\alpha=1}^{M} \sigma_{\alpha,t}^g \, dW_t^\alpha, \qquad (1.303)$$

immediately gives the drift component:

$$E_t\left[ d\left( \frac{A_t}{g_t} \right) \right] = \frac{A_t}{g_t} \left[ \mu_t^A - \mu_t^g - \sum_{\alpha=1}^{M} \sigma_{\alpha,t}^g \left( \sigma_{\alpha,t}^A - \sigma_{\alpha,t}^g \right) \right] dt \qquad (1.304)$$

$$= \frac{A_t}{g_t} \sum_{\alpha=1}^{M} \left( q_t^\alpha - \sigma_{\alpha,t}^g \right) \left( \sigma_{\alpha,t}^A - \sigma_{\alpha,t}^g \right) dt. \qquad (1.305)$$

In the last equation we have used equation (1.291) for both $g_t$ and $A_t$. In order for the ratio
$A_t/g_t$ to be a martingale process for all (arbitrary) choices of the asset $A_t$, this expectation
must be zero. This is the case if and only if the process for the price of risk $q_t^\alpha$ is related to
the numeraire asset $g_t$ as follows:

$$q_t^\alpha = \sigma_{\alpha,t}^g, \qquad \alpha = 1, \ldots, M. \qquad (1.306)$$

That is, the prices of risk $q_t^\alpha$ are equal to the volatilities of the numeraire asset for each
respective risk factor.

(iii) This is a particular case of (ii) and follows when the money-market account $B_t$ is chosen as
numeraire asset. Since $dB_t = r_t B_t \, dt$, the prices of risk in this case are all zero, i.e., $q_t^\alpha = 0$,
and therefore $\mu_t^A = r_t$ for all asset price processes $A_t$. In particular, we have that

$$A_t = E_t^{Q(B)}\left[ \frac{B_t}{B_T} A_T \right] = E_t^{Q(B)}\left[ A_T \, e^{-\int_t^T r_s \, ds} \right], \qquad (1.307)$$

giving the result. Here we have used the fact that $B_t$ at time $t$ is a known (i.e., nonstochastic)
quantity that can be taken inside the expectation.

(iv). Consider the trading strategy with positions $\theta_t^i$ in the base assets $A_t^i$. A long position
in this trading strategy and a short position in the generic asset $A_t$ is a riskless combination
that accrues at the risk-free rate. By adjusting the position $\theta_t^0$ in the money-market account
so that the trading strategy has the same value as the asset, $A_0$, at initial time $t = 0$, the resulting
trading strategy will track the price process $A_t$ for all times. This trading strategy is also
self-financing. In fact

$$dA_t = d\left( \theta_t^0 B_t + \sum_{i=1}^{n} \theta_t^i A_t^i \right) = \theta_t^0 r_t B_t \, dt + B_t \, d\theta_t^0 + \sum_{i=1}^{n} \left[ (A_t^i + dA_t^i) \, d\theta_t^i + \theta_t^i \, dA_t^i \right]. \qquad (1.308)$$

Hence equation (1.295) obtains from equation (1.296). $\square$

In summary, we observe that the asset pricing theorem is connected to the evaluation of 
conditional expectations of martingales (i.e., relative asset price processes) within a filtered 
probability space and under a choice of an equivalent probability measure (also called an 
equivalent martingale measure). A measure is specified by the chosen numeraire asset g 
obeying a stochastic price process of its own, given by equation (1.303). Given a numeraire 
g, the relative asset price process A,/g,, for a generic asset price A,, is a martingale under 
the corresponding measure Q(g). Equivalent martingale measures then arise by considering 
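different choices of numeraire assets. In particular, consider another numeraire asset, denoted
by $\tilde{g}$, with price process $\tilde{g}_t$, and suppose that measure $Q(\tilde{g})$ is equivalent to $Q(g)$; then prices
computed under any two equivalent measures must be equal:

$$A_t = g_t \, E_t^{Q(g)}\left[ \frac{A_T}{g_T} \right] = \tilde{g}_t \, E_t^{Q(\tilde{g})}\left[ \frac{A_T}{\tilde{g}_T} \right]. \qquad (1.309)$$

Rearranging terms gives

$$E_t^{Q(g)}\left[ \frac{A_T}{g_T} \right] = \frac{\tilde{g}_t}{g_t} \, E_t^{Q(\tilde{g})}\left[ \frac{A_T}{\tilde{g}_T} \right] = E_t^{Q(\tilde{g})}\left[ \frac{g_T/\tilde{g}_T}{g_t/\tilde{g}_t} \, \frac{A_T}{g_T} \right]. \qquad (1.310)$$

Note that this holds true for an arbitrary random variable $X_T = A_T/g_T$. We hence obtain the
general property under two equivalent measures:

$$E_t^{Q(g)}[X_T] = \rho_t^{-1} \, E_t^{Q(\tilde{g})}[\rho_T X_T], \qquad (1.311)$$

where $\rho_t = g_t/\tilde{g}_t = \left( \frac{dQ(g)}{dQ(\tilde{g})} \right)_t$, $t \in [0, T]$, is a Radon-Nikodym derivative of $Q(g)$ with respect
to $Q(\tilde{g})$ (with both measures being restricted to the filtration $\mathcal{F}_t$). For $t = T$ we write
$\left( \frac{dQ(g)}{dQ(\tilde{g})} \right)_T \equiv \frac{dQ(g)}{dQ(\tilde{g})}$. Choosing $X_T = 1$ in the foregoing equation shows that $\rho_t$ is also a martingale
with respect to $Q(\tilde{g})$.
Let's now fix our choice for one of the numeraires; i.e., let $\tilde{g}_t = B_t$ be the value process
of the money-market account so that $Q(\tilde{g}) = Q(B)$ is the risk-neutral measure. Taking the
stochastic differential of the quotient process $\rho_t = g_t/B_t$ gives

$$\frac{d\rho_t}{\rho_t} = (\mu_t^g - r_t) \, dt + \sum_{\alpha=1}^{M} \sigma_{\alpha,t}^g \, dW_t^\alpha. \qquad (1.312)$$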

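Under the risk-neutral measure with $dW_t^\alpha$ as Brownian increments under $Q(B)$, this process
must be driftless so that we have $\mu_t^g = r_t$. In particular, this martingale takes the form of an
exponential martingale,

$$\rho_t = \frac{g_t}{B_t} = \exp\left( -\frac{1}{2} \int_0^t \|\sigma_s^g\|^2 \, ds + \int_0^t \sigma_s^g \cdot dW_s \right), \qquad (1.313)$$

where $\|\sigma_s^g\|^2 = \sigma_s^g \cdot \sigma_s^g = \sum_{\alpha=1}^{M} (\sigma_{\alpha,s}^g)^2$ and $\sigma_s^g \cdot dW_s = \sum_{\alpha=1}^{M} \sigma_{\alpha,s}^g \, dW_s^\alpha$. At this point we
can implement the Girsanov theorem for exponential martingales, which tells us that the
$\mathbb{R}^M$-valued vector increment defined by

$$dW_t^g = -\sigma_t^g \, dt + dW_t \qquad (1.314)$$

is a standard Brownian vector increment under the measure $Q(g)$. In the risk-neutral measure
the base assets must all drift at the same risk-free rate,

$$\frac{dA_t^i}{A_t^i} = r_t \, dt + \sum_{\alpha=1}^{M} \sigma_{\alpha,t}^{A^i} \, dW_t^\alpha, \qquad i = 1, \ldots, n. \qquad (1.315)$$

Substituting for $dW_t$ using equation (1.314) into this equation and compacting to vector
notation gives

$$\frac{dA_t^i}{A_t^i} = \left( r_t + \sigma_t^g \cdot \sigma_t^{A^i} \right) dt + \sigma_t^{A^i} \cdot dW_t^g, \qquad i = 1, \ldots, n. \qquad (1.316)$$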


This last equation is therefore entirely consistent with the formulation presented earlier in
terms of the prices of risk. In particular, equation (1.316) is precisely equation (1.289),
wherein the Brownian increments are understood to be w.r.t. $Q(g)$, with $g_t$ as an arbitrary
choice of numeraire asset-price process. From equation (1.316) we again see that the vector
of the prices of risk is $q_t = \sigma_t^g$. In financial terms, each component of $q_t$ essentially represents
the excess return over the risk-free rate (per unit of risk or volatility for the component risk
factor) required by investors in a fair market.


Example 1. Perpetual Double Barrier Option — Risk-Neutral Measures. 


Reconsider the case of the perpetual double barrier option with zero interest rates discussed
previously. The pricing formula for $f_t$ is independent of the real-world stock price drift,
although this drift does in fact affect the real-world probability of hitting one barrier before
the other. Since interest rates vanish, no discounting is required, and the price process $f_t$ has
the following representation under the risk-neutral measure $Q = Q(B)$:

$$f_t = E_t^{Q}[f_\tau], \tag{1.317}$$

where $\tau$ denotes the first time at which the stock price reaches either barrier. In this case, the
price process $f_t$ is a martingale under the risk-neutral measure because interest rates are zero
for all time and the value of the money-market account is constant, i.e., unity. Hence the
martingale property gives

$$f_t = R_H\,\mathrm{Prob}^{Q}[S_\tau = H \mid S_t] + R_L\,\mathrm{Prob}^{Q}[S_\tau = L \mid S_t], \tag{1.318}$$

where the probabilities are conditional on the current stock price's value $S_t$. These probabilities
of hitting either barrier must also sum to unity,

$$\mathrm{Prob}^{Q}[S_\tau = H \mid S_t] + \mathrm{Prob}^{Q}[S_\tau = L \mid S_t] = 1. \tag{1.319}$$

Note that $f_t = aS_t + b$, where $a$ and $b$ are given by equations (1.280). Hence, the probability of
hitting either barrier under the risk-neutral measure can be found by solving equations (1.318)
and (1.319). Notice that these probabilities do not depend on the drift of the stock price under
the real-world measure.


Problems 


Problem 1. Find explicit expressions for the preceding risk-neutral probabilities
$P_L = \mathrm{Prob}^{Q}[S_\tau = L \mid S_t]$ and $P_H = \mathrm{Prob}^{Q}[S_\tau = H \mid S_t]$ (with $\tau$ the first hitting time of either
barrier). Find the limiting expressions for the case that $H \gg L$ (i.e., $H \to \infty$ for fixed $L$).
What is the price of the perpetual double barrier for this case?


Hedging with Forwards and Futures 


Let $A_t$ be an asset price process for the asset $A$. A forward contract, with value $V_t$ at time $t$,
on the underlying asset $A$ (e.g., a stock) is a contingent claim with maturity $T$ and pay-off at
time $T$ equal to

$$V_T = A_T - F, \tag{1.320}$$

where $F$ is a fixed amount. According to the fundamental theorem of asset pricing (FTAP), the
price of this contract at time $t < T$ prior to maturity is equal to $A_t - F Z_t(T)$, where $Z_t(T)$ is
the value at calendar time $t$ of a zero-coupon (discount) bond maturing at time $T$. This can be
seen in several ways. The first is the following. The payout $A_T$ can be replicated by holding
a position in the asset $A$ at all times, while the cash payment $F$ at time $T$ is equivalent to
holding a zero-coupon bond of nominal $F$ and maturing at time $T$. Alternatively, to assess
the current price $V_t$ of the forward contract using FTAP of Section 1.10, we can evaluate the
following expectation at time $t$ of the pay-off under the forward measure with $g_t = Z_t(T)$ as
numeraire, giving

$$V_t = Z_t(T)\,E_t^{Q(Z(T))}[A_T - F] = A_t - F Z_t(T). \tag{1.321}$$

Here we used the facts that at maturity $Z_T(T) = 1$ and that $E_t^{Q(Z(T))}[A_T] = A_t/Z_t(T)$,
$E_t^{Q(Z(T))}[F] = F$. The equilibrium forward price (at time $t$), denoted by $F_t(A, T)$, is the so-called
forward price such that the value $V_t$ of the forward contract at time $t$ is zero. Setting
$V_t = 0$ in equation (1.321), we find

$$F_t(A, T) = \frac{A_t}{Z_t(T)}. \tag{1.322}$$
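As a small numerical illustration of equations (1.321) and (1.322), the following sketch values a forward contract and computes the equilibrium forward price. The constant short rate used to produce a discount bond price, and all function names, are assumptions made only for this example.

```python
import math

def discount_bond_constant_rate(r, t, T):
    """Z_t(T) = exp(-r (T - t)); a constant short rate r is an illustrative assumption."""
    return math.exp(-r * (T - t))

def forward_contract_value(A_t, F, Z_tT):
    """Value V_t = A_t - F * Z_t(T) of a forward with delivery price F, as in (1.321)."""
    return A_t - F * Z_tT

def equilibrium_forward_price(A_t, Z_tT):
    """Forward price F_t(A, T) = A_t / Z_t(T), as in (1.322)."""
    return A_t / Z_tT

Z = discount_bond_constant_rate(r=0.05, t=0.0, T=1.0)
F_eq = equilibrium_forward_price(100.0, Z)
print(F_eq)                                    # equilibrium forward price
print(forward_contract_value(100.0, F_eq, Z))  # approximately zero by construction
```

Any other source for $Z_t(T)$, such as a market-quoted discount factor, could be passed in place of the constant-rate bond price.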


Let's assume stochastic interest rates, i.e., a diffusion process for the zero-coupon bond
[satisfying equation (1.349) of Problem 1], as well as diffusion processes for the asset $A_t$
[satisfying equation (1.348) of Problem 1] and the equilibrium forward price satisfying

$$\frac{dF_t(A, T)}{F_t(A, T)} = \mu_t^{F(A,T)}\,dt + \sigma_t^{F(A,T)}\,dW_t. \tag{1.323}$$

Then a relatively straightforward calculation using Itô's lemma yields the following form for
the lognormal volatility of the forward price (see Problem 1 of this section):

$$\sigma_t^{F(A,T)} = \sigma_t^{A} - \sigma_t^{Z(T)}, \tag{1.324}$$

and its drift

$$\mu_t^{F(A,T)} = \mu_t^{A} - \mu_t^{Z(T)} - \sigma_t^{Z(T)}\left(\sigma_t^{A} - \sigma_t^{Z(T)}\right), \tag{1.325}$$

where $\sigma_t^{Z(T)}$ is the lognormal volatility of the zero-coupon bond price and $\sigma_t^{A}$ that of the asset.
We note that the foregoing drift and volatility functions are generally functions of the
underlying asset price $A_t$, calendar time $t$, and maturity $T$. Moreover, these relationships hold
for any choice of numeraire asset $g_t$. As part of Problem 1 of this section, the reader is also
asked to derive more explicit expressions for the drifts and volatilities of the forward price
under various choices of numeraire.


Definition 1.22. Futures Contract. Futures contracts are characterized by an underlying
asset of price process $A_t$ and a maturity $T$. Let us partition the lifetime interval $[0, T]$ into $N$
subintervals of length $\delta t = T/N$. Let $t_i = i \cdot \delta t$ be the endpoints of the intervals. The futures
contract with reset period $\delta t$ is characterized by a futures price $F_{t_i}^*(A, T)$ for all $i = 0, \ldots, N$,
and at all times $t_i$ the following cash flow occurs at time $t_{i+1}$:

$$c_{t_{i+1}} = F_{t_{i+1}}^*(A, T) - F_{t_i}^*(A, T). \tag{1.326}$$

Furthermore, the futures price at time $t_N = T$ equals the asset price, $F_T^*(A, T) = A_T$, while
at previous times the futures price is set in such a way that the present value of the futures
contract is zero.

Recall that under the risk-neutral measure $Q(B)$, the price of risk is zero (i.e., the numeraire
$g_t$ is the money-market account $B_t$ with zero volatility with respect to all risk factors,
$\sigma^g = 0$). Hence, according to equation (1.291) of the asset pricing theorem, all asset prices
$A_t$ drift at the riskless rate $\mu_t^A = r_t$ under $Q(B)$:

$$\frac{dA_t}{A_t} = r_t\,dt + \sum_{a=1}^{M} \sigma_{a,t}^{A}\,dW_t^a, \tag{1.327}$$

where we have assumed $M$ risk factors or, in the case of one risk factor, we simply have

$$\frac{dA_t}{A_t} = r_t\,dt + \sigma_t^{A}\,dW_t. \tag{1.328}$$


Proposition. In the limit as $\delta t \to 0$, futures prices behave as (zero-drift) martingales under
the risk-neutral measure.

Proof. By definition, the futures price is such that the present value of a futures contract is
zero at all reset times $t$, and the cash flows at the subsequent times $t + \delta t$ are given by the
random variable $\delta F_t^*(A, T) = F_{t+\delta t}^*(A, T) - F_t^*(A, T)$, so the following condition holds under
the risk-neutral measure:

$$E_t^{Q(B)}\!\left[\frac{\delta F_t^*(A, T)}{B_{t+\delta t}}\right] = 0, \tag{1.329}$$

where we discount at times $t + \delta t$. Taking the limit $\delta t \to 0$ gives $B_t^{-1} E_t^{Q(B)}[dF_t^*(A, T)] = 0$.
Since $B_t$ is finite and positive, the stochastic differential $dF_t^*(A, T)$ has zero-drift terms for all $t$;
i.e., $F_t^*(A, T)$ is a martingale under the measure $Q(B)$, with $E_t^{Q(B)}[dF_t^*(A, T)] = 0$. □


The price spread between futures and forwards is given by

$$F_t^*(A, T) - F_t(A, T) = E_t^{Q(B)}[A_T] - \frac{A_t}{Z_t(T)}, \tag{1.330}$$

with $F_T^*(A, T) = F_T(A, T)$ (i.e., at maturity the two prices are the same). In Chapter 2 we shall
derive a formula for this spread based on a simple diffusion model for the asset and discount
bond. The topic of stochastic interest rates and bond pricing will be covered in Chapter 2.
However, we note here that when interest rates are deterministic (nonstochastic), where $r_t$ is a
known ordinary function of $t$, then the discount bond price is simply given by a time integral:
$Z_t(T) = \exp\!\left(-\int_t^T r_s\,ds\right) = B_t/B_T$. When interest rates are stochastic (i.e., nondeterministic),
as is more generally the case, then we can use equation (1.294) of the asset-pricing theorem,
for the case $Z_t(T)$ as asset, to express the discount bond price as an expectation of the payoff
$Z_T(T) = 1$ (i.e., the payout of exactly one dollar for certain at maturity) under the measure
with the money-market account as numeraire:

$$Z_t(T) = E_t^{Q(B)}\!\left[e^{-\int_t^T r_s\,ds}\,Z_T(T)\right] = E_t^{Q(B)}\!\left[e^{-\int_t^T r_s\,ds}\right]. \tag{1.331}$$

[This expectation is not a simple integral (as arises in the pricing of European options) and
can in fact generally be expressed as a multidimensional path integral. See, for example,
the project on interest rate trees in Part II.] In the case that the interest rate process is a
deterministic function of time or, more generally, when the underlying asset price process
$A_t$ is statistically independent of the interest rate process (where both processes may be
nondeterministic), then forward and futures prices coincide and the spread vanishes. In fact,
in this case

$$E_t^{Q(B)}[A_T] = \frac{E_t^{Q(B)}\!\left[e^{-\int_t^T r_s\,ds}\right]}{Z_t(T)}\,E_t^{Q(B)}[A_T] = \frac{E_t^{Q(B)}\!\left[e^{-\int_t^T r_s\,ds} A_T\right]}{Z_t(T)} = \frac{A_t}{Z_t(T)}, \tag{1.332}$$

where we have used equations (1.331) and (1.294).


Definition 1.23. European-Style Futures Options. European-style futures options are
contracts with a payoff function $\phi(A_T)$ at maturity $T$. They are similar to the European-style
option contracts considered earlier, except that those are written on the underlying and traded over
the counter with upfront payment, while futures options are traded using a margin account
mechanism similar to that of futures contracts. Namely, futures options are traded in terms
of a futures option price $A_t^*$ that equals $\phi(A_T)$ at maturity $t = T$, while the associated cash
flow stream to the holder's margin account is given by

$$c_t = A_t^* - A_{t-\delta t}^*. \tag{1.333}$$

Notice that, similar to an ordinary futures contract, futures option prices $A_t^*$ follow
martingale processes under the risk-neutral measure.


Example 1. European Futures Options. 


The futures option price $V_t^*$ for a European-style option with payoff function $\phi(A_T)$ is thus
given by the martingale condition

$$V_t^* = E_t^{Q(B)}[\phi(A_T)]. \tag{1.334}$$

The analogue of the Black-Scholes (i.e., lognormal) model can be written as follows under
the risk-neutral measure:

$$dA_t^* = \sigma A_t^*\,dW_t, \tag{1.335}$$

where the drift is zero because of the martingale property. We remark here that, in case
interest rates are stochastic, the implied Black-Scholes volatility on the futures option does not
necessarily coincide with the implied Black-Scholes volatility for plain vanilla equivalents.

Let $A_t^* = F_t^*(A, T)$ be the futures price on the asset. At maturity we have $A_T = F_T^*(A, T) =
F_T(A, T)$. The pricing formula for a futures call option struck at the futures price $K$ is given by

$$C_t^*(K, T) = E_t^{Q(B)}[(A_T - K)_+] = F_t^*(A, T)\,N(d_+) - K\,N(d_-), \tag{1.336}$$

where

$$d_\pm = \frac{\log(F_t^*(A, T)/K) \pm (\sigma^2/2)(T - t)}{\sigma\sqrt{T - t}}, \tag{1.337}$$

and we have used the standard expectation formula in equation (1.169) for the case of zero
drift, and where the underlying variable $S_t$ in that formula is now replaced by $F_t^*(A, T)$.
Notice that this formula carries no explicit dependence on interest rates.
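The futures call formula (1.336) with $d_\pm$ from (1.337) can be sketched in a few lines. The function names and sample inputs are illustrative assumptions; the normal distribution function is built from the standard library error function.

```python
import math

def norm_cdf(x):
    """Standard cumulative normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def futures_call(F, K, sigma, tau):
    """Futures call price F N(d_+) - K N(d_-) of (1.336); tau = T - t.

    No discount factor appears, in line with the remark that the formula
    carries no explicit dependence on interest rates.
    """
    s = sigma * math.sqrt(tau)
    d_plus = (math.log(F / K) + 0.5 * s * s) / s
    d_minus = d_plus - s
    return F * norm_cdf(d_plus) - K * norm_cdf(d_minus)

# Illustrative inputs: futures price 100, strike 100, 20% volatility, one year.
print(futures_call(100.0, 100.0, 0.2, 1.0))
```

For an at-the-money strike, $K = F_t^*$, the formula reduces to $F_t^*\,[2N(\sigma\sqrt{T-t}/2) - 1]$, which gives a quick sanity check on any implementation.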
Example 2. Variance Swaps.

An example of a dynamic trading strategy involving futures contracts and the static hedging
strategies discussed in Section 1.8 is provided by variance swaps. Variance swaps are defined
as contracts yielding the pay-off at maturity time $T$:

$$\mathcal{N}\left[\frac{1}{T}\int_0^T \sigma_t^2\,dt - \Sigma^2\right], \tag{1.338}$$

where $\mathcal{N}$ is a fixed notional amount in dollars per annualized variance. Assuming that
technical upfront fees are negligible, variance swaps are priced by specifying the variance $\Sigma^2$,
which, as we show, is computed in such a way that the value of the variance swap contract
is zero at contract inception ($t = 0$); i.e., since this is structured as a forward contract, it
must have zero initial cost. Computing the expectation of the pay-off at initial time $t = 0$
and setting this to zero therefore gives the fair value of this variance in terms of a stochastic
integral:

$$\Sigma^2 = \frac{1}{T}\,E_0^{Q(B)}\!\left[\int_0^T \sigma_t^2\,dt\right]. \tag{1.339}$$

We shall compute this expectation by recasting the integrand as follows. Assuming a diffusion
process for futures prices and assuming that European call and put options of all strikes and
maturity $T$ are available, such a contract can be replicated exactly.¹⁹

More precisely, assume that futures prices $F_t^* = F_t^*(A, T)$ on a contract maturing at time $T$
with underlying asset price $A_t$ (e.g., a stock price) at time $t$ obey the following zero-drift
process under the risk-neutral measure $Q(B)$:

$$\frac{dF_t^*}{F_t^*} = \sigma_t\,dW_t, \tag{1.340}$$

where the volatility $\sigma_t$ is a random process that can generally depend on time as well as on
other stochastic variables.

Then consider the dynamic trading strategy, whereby at time $t$ one holds $1/F_t^*$ futures
contracts. If one starts implementing the strategy at time $t = 0$ and accumulates all the
gains and losses from the futures position into a money-market account, then the worth $\Pi_T$
accumulated at time $T$ is

$$\Pi_T = \int_0^T \frac{dF_t^*}{F_t^*} = \int_0^T \sigma_t\,dW_t. \tag{1.341}$$


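Due to Itô's lemma we have

$$d\log F_t^* = \frac{dF_t^*}{F_t^*} - \frac{1}{2}\left(\frac{dF_t^*}{F_t^*}\right)^2 = \frac{dF_t^*}{F_t^*} - \frac{\sigma_t^2}{2}\,dt, \tag{1.342}$$

and integrating from time $t = 0$ to $T$ we find

$$\log F_T^* - \log F_0^* = \int_0^T \sigma_t\,dW_t - \frac{1}{2}\int_0^T \sigma_t^2\,dt = \Pi_T - \frac{1}{2}\int_0^T \sigma_t^2\,dt, \tag{1.343}$$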

¹⁹ We point out that in actuality the price of a variance swap is largely model independent. That is, it is possible
to replicate the cash flows as long as the trader can set up a static hedge and trade futures on the underlying.




where equation (1.340) has been used. Rearranging this equation gives the integrand in
equation (1.339) as

$$\frac{1}{T}\int_0^T \sigma_t^2\,dt = \frac{2}{T}\left[\Pi_T - \log\frac{F_T^*}{F_0^*}\right]. \tag{1.344}$$

This last expression demonstrates the precise nature of the replication. This contains (i) a
static part given by the logarithmic payoff function and (ii) a dynamic part given by the
stochastic time integral $\Pi_T$. Substituting this last expression into equation (1.339) and using
the fact that $\Pi_t$ is a martingale,²⁰ i.e., $E_0^{Q(B)}[\Pi_T] = 0$, we obtain

$$\Sigma^2 = -\frac{2}{T}\,E_0^{Q(B)}\!\left[\log\frac{F_T^*}{F_0^*}\right]. \tag{1.345}$$

Replicating the logarithmic payoff function in terms of standard call and/or put pay-offs of
various strikes using the replication schemes described in Section 1.8 then gives a formula for
$\Sigma^2$ in terms of futures calls and/or puts. In particular, by applying replication equation (1.248)
on the domain $F_T^* \in (0, \infty)$ and taking expectations, equation (1.345) takes the form (see
Problem 2)

$$\Sigma^2 = \frac{2}{T}\left[1 - \frac{F_0^*}{\bar F} - \log\frac{\bar F}{F_0^*} + \int_0^{\bar F} P^*(K, T)\,\frac{dK}{K^2} + \int_{\bar F}^{\infty} C^*(K, T)\,\frac{dK}{K^2}\right], \tag{1.346}$$

with any choice of nonzero parameter $\bar F \in (0, \infty)$, and where $C^*(K, T)$ and $P^*(K, T)$ represent
the current $t = 0$ prices of a futures call and put option, respectively, at strike $K$ and maturity $T$.
Note that this formula holds irrespective of the particular form assumed for the volatility $\sigma_t$.
In the cases of analytically solvable diffusion models, such as some classes of state-dependent
models studied in Chapter 3, the call and put options can be expressed in closed analytical
form. Of course, if $\sigma_t = \sigma(t)$, i.e., a deterministic function of only time, then the futures price
obeys a geometric Brownian motion, and in this case, according to our previous analysis,
we have simple analytical expressions of the Black-Scholes type, with $C_t^*(K, T)$ given by
equation (1.336), and

$$P_t^*(K, T) = E_t^{Q(B)}[(K - F_T^*)_+] = K\,N(-d_-) - F_t^*(A, T)\,N(-d_+), \tag{1.347}$$

with $d_\pm$ given by equation (1.337), wherein $\sigma \to \bar\sigma = \left[(T - t)^{-1}\int_t^T \sigma^2(s)\,ds\right]^{1/2}$. For a numerical
implementation of the efficient replication of logarithmic pay-offs for variance swaps in
cases where only a select number of market call contracts is assumed available, the reader is
encouraged to complete the project on variance swaps in Part II.
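As a consistency check on the replication formula (1.346), one can evaluate the strike integrals numerically in the special case of a constant volatility $\sigma$, where the fair variance must come out as $\Sigma^2 = \sigma^2$ for any admissible $\bar F$. The sketch below is only an illustration under that assumption: the option prices are the futures call and put of (1.336) and (1.347), the infinite strike range is truncated at about ten standard deviations in log-strike, and a composite Simpson rule (our choice, not prescribed by the text) performs the integration.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def futures_call(F, K, sigma, tau):
    s = sigma * math.sqrt(tau)
    d_plus = (math.log(F / K) + 0.5 * s * s) / s
    return F * norm_cdf(d_plus) - K * norm_cdf(d_plus - s)

def futures_put(F, K, sigma, tau):
    s = sigma * math.sqrt(tau)
    d_plus = (math.log(F / K) + 0.5 * s * s) / s
    return K * norm_cdf(-(d_plus - s)) - F * norm_cdf(-d_plus)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def fair_variance(F0, F_bar, sigma, T, width=10.0):
    """Right-hand side of (1.346) for constant volatility (illustrative assumption).

    F_bar must lie inside the truncated strike range [k_lo, k_hi].
    """
    s = sigma * math.sqrt(T)
    k_lo, k_hi = F0 * math.exp(-width * s), F0 * math.exp(width * s)
    puts = simpson(lambda K: futures_put(F0, K, sigma, T) / K**2, k_lo, F_bar)
    calls = simpson(lambda K: futures_call(F0, K, sigma, T) / K**2, F_bar, k_hi)
    return (2.0 / T) * (1.0 - F0 / F_bar - math.log(F_bar / F0) + puts + calls)

# Both choices of F_bar should reproduce sigma**2 up to quadrature error.
print(fair_variance(100.0, 100.0, 0.2, 1.0))
print(fair_variance(100.0, 110.0, 0.2, 1.0))
```

With market data one would replace the two pricing functions by quoted option prices on a discrete strike grid, which is the situation addressed by the project mentioned above.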





Problems 
Problem 1. Derive the equations for the drift and volatility of the forward price as discussed
in this section. For the domestic asset assume the process

$$\frac{dA_t}{A_t} = \mu_t^{A}\,dt + \sigma_t^{A}\,dW_t. \tag{1.348}$$

²⁰ Here we recall the property for the first moment $E_0\!\left[\int_0^t f_s\,dW_s\right] = 0$, which is valid under a suitable measure
and conditions on the adapted process $f_s$.

Let $Z_t(T)$ be the price process of a domestic discount bond of maturity $T$. For any fixed
maturity $T > t$, the discount bond price process is assumed to obey a stochastic differential
equation of the form

$$\frac{dZ_t(T)}{Z_t(T)} = \mu_t^{Z}\,dt + \sigma_t^{Z}\,dW_t, \tag{1.349}$$

where shorthand notation is used ($\mu_t^{Z} \equiv \mu_t^{Z(T)}$, $\sigma_t^{Z} \equiv \sigma_t^{Z(T)}$) to denote the lognormal drift and
volatility functions of the discount bond. Find the drift of the forward price process $F_t(A, T)$,
defined by equation (1.322), within the following three different choices of numeraire asset
$g_t$: (i) the money-market account: $g_t = B_t = e^{\int_0^t r_s\,ds}$, where $r_t$ is the domestic short rate at
time $t$, (ii) the discount bond: $g_t = Z_t(T)$, and (iii) the asset: $g_t = A_t$. Hint: Make use of the
formula for the stochastic differential of a quotient of two processes that was derived in a
previous section.


Problem 2. Use equation (1.248) with payoff function $\phi(F) = -\log(F/F_0^*)$, $F = F_T^*$, $\bar S = \bar F$,
$S_0 = 0$, $S_1 = \infty$, with $0 < \bar F < \infty$, to show

$$-\log\frac{F_T^*}{F_0^*} = 1 - \frac{F_T^*}{\bar F} - \log\frac{\bar F}{F_0^*} + \int_0^{\bar F} (K - F_T^*)_+\,\frac{dK}{K^2} + \int_{\bar F}^{\infty} (F_T^* - K)_+\,\frac{dK}{K^2}. \tag{1.350}$$

Now, arrive at the formula in equation (1.346) by taking the expectation of this pay-off at
$t = 0$ under the measure $Q(B)$ while making use of the fact that an expectation can be taken
inside any integral over $K$ and the fact that $E_0^{Q(B)}[F_T^*] = F_0^*$, i.e., that $F_t^*$ is a martingale within
this measure.


Pricing Formulas of the Black-Scholes Type 


In this section we apply the fundamental theorem of asset pricing of Section 1.10 to derive 
a few exact pricing formulas. The worked-out examples are meant to demonstrate the use of 
different numeraire assets for option pricing. 


Example 1. Plain European Call Option. 


As a first example, let's revisit the problem of pricing the plain European call. Consider
the Black-Scholes model (i.e., geometric Brownian motion) for a stock of constant volatility
$\sigma$ and in an economy with a constant interest rate $r$. Under the risk-neutral measure with
money-market account $g_t = B_t = e^{rt}$ as numeraire, the expected return on the stock is just the
risk-free rate $r$; hence,

$$dS_t = rS_t\,dt + \sigma S_t\,dW_t. \tag{1.351}$$

The stock price at maturity is given in terms of a standard normal random variable, i.e.,
$S_T = S_t\,e^{(r - \sigma^2/2)(T-t) + \sigma\sqrt{T-t}\,x}$, $x \sim N(0, 1)$. Using equation (1.290), the arbitrage-free
price at time $t$ of a European call option struck at $K > 0$ with maturity $T > t$ is hence the
discounted expectation under the risk-neutral measure $Q(B)$:

$$
\begin{aligned}
C_t(S_t, K, T) &= e^{-r(T-t)}\,E_t^{Q(B)}[(S_T - K)_+] \\
&= \frac{e^{-r(T-t)}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/2}\left(S_t\,e^{(r - \sigma^2/2)(T-t) + \sigma\sqrt{T-t}\,x} - K\right)_+ dx \\
&= S_t\,N(d_+) - K e^{-r(T-t)}\,N(d_-),
\end{aligned}
\tag{1.352}
$$

where

$$d_\pm = \frac{\log(S_t/K) + \left(r \pm \frac{1}{2}\sigma^2\right)(T - t)}{\sigma\sqrt{T - t}}. \tag{1.353}$$

Note that the details of this integral expectation were presented in Section 1.6.
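The equality between the Gaussian integral and the closed form in (1.352) can be checked numerically. The sketch below evaluates the integral with a simple midpoint rule on a truncated range (both the rule and the truncation at $|x| \le 10$ are our illustrative choices) and compares it with $S_t N(d_+) - K e^{-r(T-t)} N(d_-)$.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, tau):
    """Closed-form call price of (1.352)-(1.353); tau = T - t."""
    s = sigma * math.sqrt(tau)
    d_plus = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / s
    return S * norm_cdf(d_plus) - K * math.exp(-r * tau) * norm_cdf(d_plus - s)

def bs_call_quadrature(S, K, r, sigma, tau, n=4000, x_max=10.0):
    """Discounted risk-neutral expectation of (S_T - K)_+ by midpoint quadrature."""
    h = 2.0 * x_max / n
    total = 0.0
    for i in range(n):
        x = -x_max + (i + 0.5) * h
        S_T = S * math.exp((r - 0.5 * sigma**2) * tau + sigma * math.sqrt(tau) * x)
        total += math.exp(-0.5 * x * x) * max(S_T - K, 0.0)
    return math.exp(-r * tau) * total * h / math.sqrt(2.0 * math.pi)

closed_form = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
by_quadrature = bs_call_quadrature(100.0, 100.0, 0.05, 0.2, 1.0)
print(closed_form, by_quadrature)
```

The two numbers should agree to several decimal places; the small residual comes from the kink of the payoff at $S_T = K$ and from the truncation of the integration range.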

This Black-Scholes pricing formula plays a particularly important role because it is the 
prototype for a large number of pricing formulas. As we shall see in a number of examples 
in this and the following chapters, analytically solvable pricing problems for European-style 
options often lead to pricing formulas of a similar structure. In the case that the underlying 
asset pays continuous dividends, the foregoing pricing formula for a European call (and the 
corresponding put) must be slightly modified. A similar derivation procedure also applies, as 
shown at the end of this section. 

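If the drift and the volatility are deterministic functions of time, $r = r(t)$ and $\sigma = \sigma(t)$,
the Black-Scholes formula extends thanks to the formula in equation (1.167) of Section 1.6.
Using again the money-market account $g_t = B_t = \exp\!\left(\int_0^t r(s)\,ds\right)$ as numeraire asset and setting

$$\bar r(t, T) = \frac{1}{T - t}\int_t^T r(u)\,du$$

gives $B_t/B_T = e^{-\bar r(t,T)(T-t)}$, and we find

$$
\begin{aligned}
C_t(S_t, K, T) &= e^{-\bar r(t,T)(T-t)}\,E_t^{Q(B)}[(S_T - K)_+] \\
&= \frac{e^{-\bar r(t,T)(T-t)}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/2}\left(S_t\,e^{\left(\bar r(t,T) - \frac{1}{2}\bar\sigma(t,T)^2\right)(T-t) + \bar\sigma(t,T)\sqrt{T-t}\,x} - K\right)_+ dx \\
&= S_t\,N(d_+) - K e^{-\bar r(t,T)(T-t)}\,N(d_-),
\end{aligned}
\tag{1.354}
$$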





where

$$\bar\sigma(t, T) = \left[\frac{1}{T - t}\int_t^T \sigma(u)^2\,du\right]^{1/2} \tag{1.355}$$

and

$$d_\pm = \frac{\log(S_t/K) + \left(\bar r(t, T) \pm \frac{1}{2}\bar\sigma(t, T)^2\right)(T - t)}{\bar\sigma(t, T)\sqrt{T - t}}. \tag{1.356}$$

Note that, in agreement with the results obtained in Section 1.6, the Black-Scholes pricing
formula now involves the time-averaged interest rate and volatility over the time to
maturity $T - t$.
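A minimal sketch of the time-dependent case follows: the averages $\bar r(t,T)$ and $\bar\sigma(t,T)$ are computed by a midpoint rule and inserted into the Black-Scholes formula. The function names and the particular rate and volatility curves are hypothetical choices made for illustration only.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def time_average(f, a, b, n=1000):
    """(1/(b - a)) times the integral of f over [a, b], by the midpoint rule."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h / (b - a)

def bs_call_time_dependent(S, K, t, T, r_fn, sigma_fn):
    """Call price with deterministic r(u) and sigma(u), via the averaged parameters."""
    tau = T - t
    r_bar = time_average(r_fn, t, T)
    sigma_bar = math.sqrt(time_average(lambda u: sigma_fn(u) ** 2, t, T))
    s = sigma_bar * math.sqrt(tau)
    d_plus = (math.log(S / K) + (r_bar + 0.5 * sigma_bar**2) * tau) / s
    return S * norm_cdf(d_plus) - K * math.exp(-r_bar * tau) * norm_cdf(d_plus - s)

# Hypothetical term structures: a linearly rising rate and a declining volatility.
price = bs_call_time_dependent(
    100.0, 100.0, 0.0, 1.0,
    r_fn=lambda u: 0.03 + 0.02 * u,
    sigma_fn=lambda u: 0.25 - 0.05 * u,
)
print(price)
```

When both curves are constant the routine collapses to the constant-parameter formula, which is a convenient check.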








Example 2. A Currency Option.

Let

$$dX_t = \mu_X X_t\,dt + \sigma_X X_t\,dW_t \tag{1.357}$$

be a model for the foreign exchange rate $X_t$ at time $t$, assuming that the lognormal volatility
$\sigma_X$ of the exchange rate and drift $\mu_X$ are constants. Suppose that the domestic risk-free interest
rate $r^d$ and the foreign interest rate $r^f$ are both constant, and let $B_t^d = e^{r^d t}$ and $B_t^f = e^{r^f t}$ be
the worth of the two money-market accounts, respectively. The drift $\mu_X$ can be computed as
follows. First we note that the foreign currency money-market account, after conversion into
domestic currency, is a domestic asset and therefore must obey a price process of the form

$$d(X_t B_t^f) = (r^d + \sigma_g\,\sigma_{XB^f})(X_t B_t^f)\,dt + \sigma_{XB^f}(X_t B_t^f)\,dW_t, \tag{1.358}$$

where $\sigma_g$ and $\sigma_{XB^f}$ are lognormal volatilities of the numeraire $g_t$ and $X_t B_t^f$, respectively.
We shall choose $g_t = B_t^d$ (i.e., the domestic risk-neutral measure) giving $\sigma_g = 0$. By direct
application of Itô's lemma for the product of two processes we also have the stochastic
differential

$$d(X_t B_t^f) = X_t\,dB_t^f + B_t^f\,dX_t + (dX_t)(dB_t^f) = X_t\,dB_t^f + B_t^f\,dX_t, \tag{1.359}$$

where the third term in the middle expression is of order $dt\,dW_t$ and hence set to zero.
This follows since both domestic and foreign money-market accounts satisfy a deterministic
differential equation, in particular,

$$dB_t^f = r^f B_t^f\,dt. \tag{1.360}$$

Plugging this and equation (1.357) into equation (1.359) gives

$$d(X_t B_t^f) = (r^f + \mu_X)(X_t B_t^f)\,dt + \sigma_X (X_t B_t^f)\,dW_t. \tag{1.361}$$

Hence, comparing equations (1.358) and (1.361) gives $\mu_X = r^d - r^f$. The foreign exchange
rate therefore follows a geometric Brownian motion with this constant drift and constant
volatility $\sigma_X$. The pricing formula for a foreign exchange call option struck at exchange rate
$K$ is then

$$
\begin{aligned}
C_t(X_t, K, T) &= e^{-r^d(T-t)}\,E_t^{Q(B^d)}[(X_T - K)_+] \\
&= \frac{e^{-r^d(T-t)}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/2}\left(X_t\,e^{\left(r^d - r^f - \frac{1}{2}\sigma_X^2\right)(T-t) + \sigma_X\sqrt{T-t}\,x} - K\right)_+ dx \\
&= e^{-r^d(T-t)}\left[e^{(r^d - r^f)(T-t)} X_t\,N(d_+) - K\,N(d_-)\right] \\
&= e^{-r^f(T-t)} X_t\,N(d_+) - K e^{-r^d(T-t)}\,N(d_-),
\end{aligned}
\tag{1.362}
$$

where

$$d_\pm = \frac{\log(X_t/K) + \left(r^d - r^f \pm \frac{1}{2}\sigma_X^2\right)(T - t)}{\sigma_X\sqrt{T - t}}. \tag{1.363}$$
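The currency call of (1.362) and (1.363) can be coded directly. In the sketch below the function and argument names are our own; the formula is the last line of (1.362).

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def fx_call(X, K, r_d, r_f, sigma_X, tau):
    """Foreign exchange call: exp(-r_f tau) X N(d_+) - K exp(-r_d tau) N(d_-)."""
    s = sigma_X * math.sqrt(tau)
    d_plus = (math.log(X / K) + (r_d - r_f + 0.5 * sigma_X**2) * tau) / s
    d_minus = d_plus - s
    return (math.exp(-r_f * tau) * X * norm_cdf(d_plus)
            - K * math.exp(-r_d * tau) * norm_cdf(d_minus))

# Illustrative inputs: spot 1.25, strike 1.20, domestic 4%, foreign 2%, 10% volatility.
print(fx_call(1.25, 1.20, 0.04, 0.02, 0.10, 0.5))
```

Structurally, the foreign rate $r^f$ enters exactly as a continuous dividend yield would for a stock, which is why the same routine can serve both cases.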


Example 3. A Quanto Option. 


Consider the case of a quanto option, in which we have a stock denominated in a foreign
currency with geometric Brownian process

$$dS_t^f = \mu S_t^f\,dt + \sigma_S S_t^f\,dW_t^S, \tag{1.364}$$

and the foreign exchange process is also a geometric Brownian motion, with

$$dX_t = (r^d - r^f) X_t\,dt + \sigma_X X_t\,dW_t^X, \tag{1.365}$$

under the risk-neutral measure with numeraire $g_t = B_t^d$. Note that the drift rate $\mu_X = r^d - r^f$
was derived in the previous example. The constants $\sigma_S$ and $\sigma_X$ are the lognormal volatilities
of the stock and foreign exchange rate, respectively. These Brownian increments are not
independent; however, the foregoing equations can also be written equivalently in terms of
two independent Brownian increments $dW_t^1$, $dW_t^2$, where

$$dW_t^X = \rho\,dW_t^1 + \sqrt{1 - \rho^2}\,dW_t^2, \qquad dW_t^S = dW_t^1.$$

Here $\rho$ is a correlation between the stock price and the foreign exchange rate at time $t$, with

$$dW_t^S\,dW_t^X = \rho\,dt. \tag{1.366}$$

In vector notation, $dW_t = (dW_t^1, dW_t^2)$ and

$$\frac{dX_t}{X_t} = (r^d - r^f)\,dt + \boldsymbol{\sigma}_X \cdot dW_t, \tag{1.367}$$

$$\frac{dS_t^f}{S_t^f} = \mu\,dt + \boldsymbol{\sigma}_S \cdot dW_t, \tag{1.368}$$

where $\boldsymbol{\sigma}_X = (\rho\sigma_X, \sigma_X\sqrt{1 - \rho^2})$, $\boldsymbol{\sigma}_S = (\sigma_S, 0)$. Suppose one wants to price a call option on
the stock $S_t^f$ struck at $K$ and then to convert this into domestic currency at a preassigned fixed
rate $\bar X$. Since $g_t = B_t^d$, the prices of all domestic assets (as well as the prices of foreign assets
denominated in domestic currency) drift at the domestic risk-free rate. Hence the return on the
price process $X_t S_t^f$ must be $r^d$. This also follows because the price of risk $q^g = \sigma^g = \sigma_{B^d} = 0$.
By direct application of Itô's lemma we also have

$$\frac{d(X_t S_t^f)}{X_t S_t^f} = \frac{dS_t^f}{S_t^f} + \frac{dX_t}{X_t} + \frac{dS_t^f}{S_t^f}\,\frac{dX_t}{X_t}. \tag{1.369}$$





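Plugging the preceding expressions into this equation gives

$$
\begin{aligned}
\frac{d(X_t S_t^f)}{X_t S_t^f} &= (\mu + r^d - r^f + \boldsymbol{\sigma}_X \cdot \boldsymbol{\sigma}_S)\,dt + (\boldsymbol{\sigma}_X + \boldsymbol{\sigma}_S) \cdot dW_t \\
&= (\mu + r^d - r^f + \rho\sigma_X\sigma_S)\,dt + \sigma_S\,dW_t^S + \sigma_X\,dW_t^X.
\end{aligned}
\tag{1.370}
$$

Since the drift must equal $r^d$,

$$\mu = r^f - \rho\sigma_S\sigma_X \tag{1.371}$$

is the constant drift of $S_t^f$ in equation (1.364). The arbitrage-free price of a quanto call option
struck at foreign price $K$ is then given by

$$
\begin{aligned}
C_t(S_t^f, K, T) &= \bar X e^{-r^d(T-t)}\,E_t^{Q(B^d)}[(S_T^f - K)_+] \\
&= \bar X e^{-r^d(T-t)}\left[e^{(r^f - \rho\sigma_S\sigma_X)(T-t)} S_t^f\,N(d_+) - K\,N(d_-)\right] \\
&= \bar X e^{-(r^d - r^f + \rho\sigma_S\sigma_X)(T-t)}\left[S_t^f\,N(d_+) - e^{-(r^f - \rho\sigma_S\sigma_X)(T-t)} K\,N(d_-)\right],
\end{aligned}
\tag{1.372}
$$

where

$$d_\pm = \frac{\log(S_t^f/K) + \left(r^f - \rho\sigma_S\sigma_X \pm \frac{1}{2}\sigma_S^2\right)(T - t)}{\sigma_S\sqrt{T - t}}. \tag{1.373}$$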
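The quanto call can be sketched as follows, using the second line of (1.372) with the drift $\mu = r^f - \rho\sigma_S\sigma_X$ of (1.371). All names and sample values are illustrative assumptions.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def quanto_call(S_f, K, X_bar, r_d, r_f, rho, sigma_S, sigma_X, tau):
    """Quanto call: X_bar exp(-r_d tau) [exp(mu tau) S_f N(d_+) - K N(d_-)]."""
    mu = r_f - rho * sigma_S * sigma_X      # drift of the foreign stock under Q(B^d)
    s = sigma_S * math.sqrt(tau)
    d_plus = (math.log(S_f / K) + (mu + 0.5 * sigma_S**2) * tau) / s
    d_minus = d_plus - s
    return X_bar * math.exp(-r_d * tau) * (
        math.exp(mu * tau) * S_f * norm_cdf(d_plus) - K * norm_cdf(d_minus)
    )

# Illustrative inputs: fixed conversion rate 1.5, correlation 0.4.
print(quanto_call(100.0, 95.0, 1.5, 0.04, 0.02, 0.4, 0.2, 0.1, 1.0))
```

With zero correlation and equal interest rates the price reduces to $\bar X$ times an ordinary Black-Scholes call, and a positive correlation lowers the call price through the reduced drift.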
Example 4. Elf-X Option (Equity-Linked Foreign Exchange Option).

Assume equation (1.364), as in the previous example, and now write

$$dX_t = \mu_X X_t\,dt + \sigma_X X_t\,dW_t^X \tag{1.374}$$

for the foreign exchange process, with $\mu_X$ dependent on the choice of numeraire. Consider
the case where the pay-off is

$$C_T = (X_T - K)_+\,S_T^f. \tag{1.375}$$

The foreign asset price $S_t^f$ cannot be used as a domestic numeraire asset, but the converted
process $g_t = X_t S_t^f$ can. Indeed this is a positive price process denominated in domestic
currency. Under the measure with $g_t$ as numeraire we first need to compute the drift $\mu_X$
explicitly. This is done by considering the process $X_t B_t^f$, which must drift at the domestic
risk-free rate plus a price-of-risk component:

$$\frac{d(X_t B_t^f)}{X_t B_t^f} = (r^d + \boldsymbol{\sigma}_{XS^f} \cdot \boldsymbol{\sigma}_{XB^f})\,dt + \boldsymbol{\sigma}_{XB^f} \cdot dW_t, \tag{1.376}$$

where $\boldsymbol{\sigma}_{XS^f}$ and $\boldsymbol{\sigma}_{XB^f}$ are volatility vectors of the price processes $X_t S_t^f$ and $X_t B_t^f$, respectively.
These are expressible in the basis of either $(dW_t^1, dW_t^2)$ or $(dW_t^S, dW_t^X)$, as described in the
previous example. [Note also that the Brownian increments, written still as $dW_t$ in the SDE,
are actually w.r.t. the measure $Q(XS^f)$.] From equation (1.370) we have $\boldsymbol{\sigma}_{XS^f} = \boldsymbol{\sigma}_X + \boldsymbol{\sigma}_S$.
From a direct application of Itô's lemma we also have

$$\frac{d(X_t B_t^f)}{X_t B_t^f} = (r^f + \mu_X)\,dt + \boldsymbol{\sigma}_X \cdot dW_t. \tag{1.377}$$

By equating drifts and the volatility vectors in these two expressions we find $\boldsymbol{\sigma}_{XB^f} = \boldsymbol{\sigma}_X$ and

$$r^d + (\boldsymbol{\sigma}_S + \boldsymbol{\sigma}_X) \cdot \boldsymbol{\sigma}_X = r^f + \mu_X. \tag{1.378}$$

Hence,

$$\mu_X = r^d - r^f + \boldsymbol{\sigma}_S \cdot \boldsymbol{\sigma}_X + \|\boldsymbol{\sigma}_X\|^2.$$

The drift $\mu_{X^{-1}}$ and volatility of the inverse exchange rate $X_t^{-1}$ under the same measure are
computed using Itô's lemma [i.e., apply equation (1.138) with numerator = 1 and
denominator = $X_t$]:

$$\frac{dX_t^{-1}}{X_t^{-1}} = (-\mu_X + \sigma_X^2)\,dt - \sigma_X\,dW_t^X.$$

Hence,

$$\mu_{X^{-1}} = -\mu_X + \sigma_X^2 = r^f - r^d - \boldsymbol{\sigma}_S \cdot \boldsymbol{\sigma}_X = r^f - r^d - \rho\sigma_X\sigma_S,$$

where the square of the volatility is the same as that of $X_t$, namely $\sigma_X^2$. Using the measure
$Q(XS^f)$, we therefore have the arbitrage-free price:

$$
\begin{aligned}
C_t(S_t^f, X_t, K, T) &= (X_t S_t^f)\,E_t^{Q(XS^f)}\!\left[\frac{S_T^f (X_T - K)_+}{X_T S_T^f}\right] \\
&= K X_t S_t^f\,E_t^{Q(XS^f)}\!\left[(K^{-1} - X_T^{-1})_+\right] \\
&= K X_t S_t^f\left[K^{-1} N(-d_-) - e^{\mu_{X^{-1}}(T-t)} X_t^{-1} N(-d_+)\right] \\
&= S_t^f\left[X_t\,N(-d_-) - e^{-(r^d - r^f + \rho\sigma_X\sigma_S)(T-t)} K\,N(-d_+)\right],
\end{aligned}
\tag{1.379}
$$

where

$$d_\pm = \frac{\log(K/X_t) + \left(r^f - r^d - \rho\sigma_X\sigma_S \pm \frac{1}{2}\sigma_X^2\right)(T - t)}{\sigma_X\sqrt{T - t}}. \tag{1.380}$$
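The Elf-X price of (1.379) and (1.380) translates into the short sketch below. The function and argument names are our own, and the quantity `mu_inv` stands for the drift $\mu_{X^{-1}} = r^f - r^d - \rho\sigma_X\sigma_S$ of the inverse exchange rate under $Q(XS^f)$.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def elf_x_call(S_f, X, K, r_d, r_f, rho, sigma_S, sigma_X, tau):
    """Elf-X price: S_f [X N(-d_-) - exp(mu_inv tau) K N(-d_+)], with tau = T - t."""
    mu_inv = r_f - r_d - rho * sigma_X * sigma_S   # drift of 1/X under Q(X S^f)
    s = sigma_X * math.sqrt(tau)
    d_plus = (math.log(K / X) + (mu_inv + 0.5 * sigma_X**2) * tau) / s
    d_minus = d_plus - s
    return S_f * (X * norm_cdf(-d_minus) - math.exp(mu_inv * tau) * K * norm_cdf(-d_plus))

# Illustrative inputs: foreign stock 50, exchange rate 1.2, strike 1.0.
print(elf_x_call(50.0, 1.2, 1.0, 0.04, 0.02, 0.3, 0.25, 0.10, 1.0))
```

Note that the stock volatility enters only through the covariance term $\rho\sigma_X\sigma_S$; the option is effectively a put on $X^{-1}$ priced in units of the numeraire $X_t S_t^f$.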








Let us now consider Black-Scholes pricing formulas as well as symmetry relations for
European calls and puts under an economy whereby the underlying asset pays continuous
dividends. This will be useful for the discussion on American options in Section 1.14.
In particular, let us assume that the asset price $S_t$ follows geometric Brownian motion, as in
Example 1, but with an additional drift term due to a constant dividend yield $q$:

$$dS_t = (r - q) S_t\,dt + \sigma S_t\,dW_t. \tag{1.381}$$

Note that from equation (1.165) we readily have the risk-neutral lognormal transition density
for this asset price process,

$$p(S_T, S_t; \tau) = \frac{1}{\sigma S_T\sqrt{2\pi\tau}}\,\exp\!\left(-\frac{\left[\log(S_T/S_t) - \left(r - q - \frac{\sigma^2}{2}\right)\tau\right]^2}{2\sigma^2\tau}\right), \tag{1.382}$$

$\tau = T - t$. We follow Example 1 and choose $B_t = e^{rt}$ as numeraire. Then, using
equation (1.169) with drift $(r - q)$ as given by equation (1.381), the price of a European call struck
at $K$ with underlying asset paying continuous dividend $q$ is

$$
\begin{aligned}
C_t(S_t, K, T) &= e^{-r(T-t)}\,E_t^{Q(B)}[(S_T - K)_+] \\
&= e^{-r(T-t)}\left[e^{(r-q)(T-t)} S_t\,N(d_+) - K\,N(d_-)\right] \\
&= e^{-q(T-t)} S_t\,N(d_+) - K e^{-r(T-t)}\,N(d_-),
\end{aligned}
\tag{1.383}
$$

with

$$d_\pm = \frac{\log(S_t/K) + \left(r - q \pm \frac{1}{2}\sigma^2\right)(T - t)}{\sigma\sqrt{T - t}}. \tag{1.384}$$

The corresponding European put price is easily derived in similar fashion, giving

$$P_t(S_t, K, T) = K e^{-r(T-t)}\,N(-d_-) - S_t e^{-q(T-t)}\,N(-d_+). \tag{1.385}$$

The previous put-call parity relation for plain European calls and puts, i.e., equation (1.214),
is now modified to read

$$C_t(S_t, K, T) - P_t(S_t, K, T) = e^{-q(T-t)} S_t - K e^{-r(T-t)} \tag{1.386}$$

for generally nonzero $q$.

This put-call parity is a rather general property that obtains whenever relative asset prices
are martingales. Within the geometric Brownian motion model, we can further establish
another special symmetry property that relates a call price to its corresponding put price.
In particular, explicitly denoting the dependence on the interest rate $r$ and dividend yield $q$,
we have

$$C_t(S, K, T; r, q) = P_t(K, S, T; q, r). \tag{1.387}$$

This relation states that the Black-Scholes pricing formula for a call, with spot $S_t = S$,
strike $K$, interest rate $r$, and dividend $q$, is the same as the Black-Scholes pricing formula
for a put where one inputs the strike as $S$, spot as $S_t = K$, interest rate as $q$, and dividend
as $r$. That is, by interchanging $r$ and $q$ and interchanging $S$ and $K$, the call and put pricing
formulas give the same price. For this reason we can also refer to identity (1.387) as a put-call
reversal symmetry. This result can be established by relating expectations under different
numeraires as follows. Consider the modified asset price process defined by $\tilde S_t = e^{qt} S_t$; then
Itô's lemma gives

$$d\tilde S_t = r\tilde S_t\,dt + \sigma\tilde S_t\,dW_t \tag{1.388}$$


within the risk-neutral measure. By alternatively choosing $g_t = \tilde S_t$ as numeraire,
equation (1.292) gives the arbitrage-free price of the call as

$$C_t(S, K, T; r, q) = S K e^{-q(T-t)}\,E_t^{Q(g)}\!\left[(K^{-1} - X_T)_+\right], \tag{1.389}$$

where we have used the spot value $S_t = S$ and defined the process $X_t = S_t^{-1}$. From
equation (1.388), we see that the lognormal volatility of $g_t$ (or the price of risk) is $\sigma$; therefore,
under the new measure, $Q(g)$, equation (1.381) becomes

$$dS_t = (r - q + \sigma^2) S_t\,dt + \sigma S_t\,d\tilde W_t, \tag{1.390}$$

where $d\tilde W_t$ denotes the Brownian increment under measure $Q(g)$. Using this equation and
applying Itô's lemma to $X_t = S_t^{-1}$ gives

$$dX_t = (q - r) X_t\,dt - \sigma X_t\,d\tilde W_t. \tag{1.391}$$

Under $Q(g)$, the transition density $p$ for the process $X_t$ is hence given by equation (1.382)
with $r$ and $q$ interchanged and the replacement $S_t \to X_t$, $S_T \to X_T$:

$$p(X_T, X_t; \tau) = \frac{1}{\sigma X_T\sqrt{2\pi\tau}}\,\exp\!\left(-\frac{\left[\log(X_T/X_t) - \left(q - r - \frac{\sigma^2}{2}\right)\tau\right]^2}{2\sigma^2\tau}\right). \tag{1.392}$$

Under $Q(g)$, the drift of the lognormal diffusion $X_t$ is $q - r$. Using equations (1.171) and
(1.392) with spot $X_t = 1/S_t = 1/S$ at current time $t$, the expectation in equation (1.389) is
evaluated to give

$$C_t(S, K, T; r, q) = S K\,P_t(1/S, 1/K, T; q, r). \tag{1.393}$$

This establishes the identity, which is actually equivalent to equation (1.387), as can be
verified using equation (1.385). Finally, note that equation (1.387) is also verified by directly
manipulating equation (1.385) or (1.383).
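The put-call parity (1.386), the reversal symmetry (1.387), and the scaled form (1.393) are easy to confirm numerically from the closed-form prices (1.383) and (1.385). The sketch below does this for one arbitrary set of inputs; the function names and sample values are illustrative assumptions.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d_plus_minus(S, K, tau, r, q, sigma):
    s = sigma * math.sqrt(tau)
    d_plus = (math.log(S / K) + (r - q + 0.5 * sigma**2) * tau) / s
    return d_plus, d_plus - s

def call(S, K, tau, r, q, sigma):
    """European call with continuous dividend yield q, equation (1.383)."""
    dp, dm = d_plus_minus(S, K, tau, r, q, sigma)
    return S * math.exp(-q * tau) * norm_cdf(dp) - K * math.exp(-r * tau) * norm_cdf(dm)

def put(S, K, tau, r, q, sigma):
    """European put with continuous dividend yield q, equation (1.385)."""
    dp, dm = d_plus_minus(S, K, tau, r, q, sigma)
    return K * math.exp(-r * tau) * norm_cdf(-dm) - S * math.exp(-q * tau) * norm_cdf(-dp)

S, K, tau, r, q, sigma = 100.0, 90.0, 0.75, 0.05, 0.02, 0.3
c = call(S, K, tau, r, q, sigma)

parity_gap = c - put(S, K, tau, r, q, sigma) - (S * math.exp(-q * tau) - K * math.exp(-r * tau))
reversal_gap = c - put(K, S, tau, q, r, sigma)                      # swap S<->K and r<->q
scaling_gap = c - S * K * put(1.0 / S, 1.0 / K, tau, q, r, sigma)   # equation (1.393)
print(parity_gap, reversal_gap, scaling_gap)   # all three vanish up to rounding
```

The argument order `put(K, S, tau, q, r, sigma)` mirrors the statement of (1.387): spot and strike are interchanged, and so are the interest rate and the dividend yield.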

A class of slightly more sophisticated options that can also be valued analytically within
the Black-Scholes model are European-style compound options. Such contracts are options
on an option. Examples are a call-on-a-call and a call-on-a-put. Such compound options
are hence characterized by two expiration dates, $T_1$ and $T_2$, and two strike values. Let us
specifically consider a call-on-a-call option. This contract gives the holder the right (not the
obligation) to purchase an underlying call option for a fixed strike price $K_1$ at calendar time $T_1$.
The underlying call is a call option on an asset or stock with strike $K_2$ and expiring at a later
calendar time $T_2 > T_1$; we denote its value by $C_{T_1}(S_{T_1}, K_2, T_2)$, where $S_{T_1}$ denotes the stock
price at $T_1$. Hence at time $T_1$ this underlying call will be purchased (i.e., the compound
call-on-a-call will be exercised at time $T_1$) only if $C_{T_1}(S_{T_1}, K_2, T_2) > K_1$. Let $t$ denote current calendar
time, $t < T_1 < T_2$; then the pay-off of the call-on-a-call at $T_1$ is $(C_{T_1}(S_{T_1}, K_2, T_2) - K_1)_+$.
Since $C_{T_1}$ is a monotonically increasing function of $S_{T_1}$, this pay-off is nonzero only for values
of $S_{T_1}$ above a (critical) value $S_1^*$ defined as the unique solution to the (nonlinear) equation
$C_{T_1}(S_1^*, K_2, T_2) = K_1$. Hence $(C_{T_1}(S_{T_1}, K_2, T_2) - K_1)_+ = C_{T_1}(S_{T_1}, K_2, T_2) - K_1$ for $S_{T_1} > S_1^*$
and zero otherwise.

Denoting the value of the call-on-a-call option by $V^{cc}(S, t)$, where $S_t = S$ is the spot
at time $t$, and assuming a constant interest rate with $g_t = e^{rt}$ as numeraire asset price, we
generally have

$$
V^{cc}(S, t) = e^{-r(T_1 - t)} E_t^{Q(B)}\!\left[ \left( C_{T_1}(S_{T_1}, K_2, T_2) - K_1 \right)_+ \right]. \tag{1.394}
$$

Specializing to the case where the stock price process obeys equation (1.381) within the
risk-neutral measure $Q(B)$, this expectation is readily evaluated in terms of the standard univariate
cumulative normal and bivariate cumulative normal functions. Inserting the price of the call
from equation (1.383) gives

$$
V^{cc}(S, t) = e^{-r\tau_1} \int_{S_1^*}^{\infty} \left[ e^{-q(T_2 - T_1)} S_1 N(d_+) - K_2 e^{-r(T_2 - T_1)} N(d_-) - K_1 \right] p(S_1, S; \tau_1)\, dS_1, \tag{1.395}
$$

$\tau_1 = T_1 - t$, where $p$ is the transition density function defined in equation (1.382) and

$$
d_\pm = \frac{\log(S_1/K_2) + \left(r - q \pm \tfrac{1}{2}\sigma^2\right)(T_2 - T_1)}{\sigma\sqrt{T_2 - T_1}}. \tag{1.396}
$$

Equation (1.395) is a sum of three integrals. The third integral term involves the risk-neutral
probability that the stock price is above $S_1^*$ after a time $\tau_1$, having started at $S$. This
integral is reduced to a standard cumulative normal function by changing the integration
variable to $x = \log S_1$:

$$
\int_{S_1^*}^{\infty} p(S_1, S; \tau_1)\, dS_1 = N(a_-), \tag{1.397}
$$

where we define

$$
a_\pm = \frac{\log(S/S_1^*) + \left(r - q \pm \tfrac{1}{2}\sigma^2\right)\tau_1}{\sigma\sqrt{\tau_1}}. \tag{1.398}
$$

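The second integral term in equation (1.395) can be rewritten using

$$
N(d_-) = \int_{K_2}^{\infty} p(S_2, S_1; T_2 - T_1)\, dS_2, \tag{1.399}
$$

giving

$$
\int_{S_1^*}^{\infty} N(d_-)\, p(S_1, S; \tau_1)\, dS_1 = \int_{S_1^*}^{\infty} \int_{K_2}^{\infty} p(S_2, S_1; T_2 - T_1)\, p(S_1, S; \tau_1)\, dS_2\, dS_1. \tag{1.400}
$$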


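This double integral can be recast in terms of a standard bivariate cumulative normal function

$$
N_2(a, b; \rho) = \frac{1}{2\pi\sqrt{1 - \rho^2}} \int_{-\infty}^{a} \int_{-\infty}^{b} \exp\!\left( -\frac{x^2 + y^2 - 2\rho x y}{2(1 - \rho^2)} \right) dy\, dx, \tag{1.401}
$$

where $\rho$ is a correlation coefficient. For this purpose it proves useful to define

$$
\tau_2 = T_2 - t \quad \text{and} \quad \rho = \sqrt{\tau_1/\tau_2}; \tag{1.402}
$$

hence $T_2 - T_1 = \tau_2 - \tau_1$. Introducing the change of variables

$$
x = \frac{\log(S_1/S) - \left(r - q - \tfrac{1}{2}\sigma^2\right)\tau_1}{\sigma\sqrt{\tau_1}}, \qquad
y = \frac{\log(S_2/S) - \left(r - q - \tfrac{1}{2}\sigma^2\right)\tau_2}{\sigma\sqrt{\tau_2}},
$$

equation (1.400) then becomes (after some algebraic manipulation)

$$
\int_{S_1^*}^{\infty} N(d_-)\, p(S_1, S; \tau_1)\, dS_1
= \frac{1}{2\pi\sqrt{1 - \rho^2}} \int_{-\infty}^{a_-} \int_{-\infty}^{b_-} \exp\!\left[ -\frac{x^2}{2} - \frac{(y - \rho x)^2}{2(1 - \rho^2)} \right] dy\, dx
= N_2(a_-, b_-; \rho), \tag{1.403}
$$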





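where

$$
b_\pm = \frac{\log(S/K_2) + \left(r - q \pm \tfrac{1}{2}\sigma^2\right)\tau_2}{\sigma\sqrt{\tau_2}}, \qquad \tau_2 = T_2 - t. \tag{1.404}
$$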


We leave it to the reader to verify that the first integral term in equation (1.395) can be
reduced, using similar manipulations as earlier, to give

$$
\int_{S_1^*}^{\infty} S_1 N(d_+)\, p(S_1, S; \tau_1)\, dS_1 = S e^{(r - q)\tau_1} N_2(a_+, b_+; \rho). \tag{1.405}
$$


Combining the three integrals in equation (1.395) finally gives

$$
V^{cc}(S, t) = S e^{-q\tau_2} N_2(a_+, b_+; \rho) - K_2 e^{-r\tau_2} N_2(a_-, b_-; \rho) - K_1 e^{-r\tau_1} N(a_-). \tag{1.406}
$$


Derivations of similar pricing formulas for related types of compound options are left to the 
interested reader (see Problem 10). 
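
The closed-form price (1.406) can be cross-checked against a direct numerical evaluation of the integral (1.395). The sketch below is one possible way to do this using only the Python standard library. It assumes a bisection search for the critical spot $S_1^*$ and a composite Simpson rule for both the bivariate normal function and the check integral; all function names and parameter values are illustrative assumptions rather than part of the text.

```python
from math import erf, exp, log, pi, sqrt

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def norm_pdf(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def bs_call(S, K, tau, r, q, sigma):
    d_plus = (log(S / K) + (r - q + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    d_minus = d_plus - sigma * sqrt(tau)
    return exp(-q * tau) * S * norm_cdf(d_plus) - exp(-r * tau) * K * norm_cdf(d_minus)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def bivariate_cdf(a, b, rho):
    """N_2(a, b; rho) as the integral of phi(x) N((b - rho x)/sqrt(1 - rho^2)) over x <= a."""
    s = sqrt(1.0 - rho * rho)
    lower = min(a, 0.0) - 10.0  # lower limit truncated where the mass is negligible
    return simpson(lambda x: norm_pdf(x) * norm_cdf((b - rho * x) / s), lower, a)

def critical_spot(K1, K2, dt, r, q, sigma):
    """Bisection for S_1^* solving bs_call(S_1^*, K2, dt) = K1 (the call is increasing in S)."""
    lo, hi = 1e-8, K1 + K2
    while bs_call(hi, K2, dt, r, q, sigma) < K1:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(mid, K2, dt, r, q, sigma) < K1:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def call_on_call(S, K1, K2, tau1, tau2, r, q, sigma):
    """Closed-form value, equation (1.406)."""
    S_star = critical_spot(K1, K2, tau2 - tau1, r, q, sigma)
    rho = sqrt(tau1 / tau2)
    a_plus = (log(S / S_star) + (r - q + 0.5 * sigma ** 2) * tau1) / (sigma * sqrt(tau1))
    a_minus = a_plus - sigma * sqrt(tau1)
    b_plus = (log(S / K2) + (r - q + 0.5 * sigma ** 2) * tau2) / (sigma * sqrt(tau2))
    b_minus = b_plus - sigma * sqrt(tau2)
    return (S * exp(-q * tau2) * bivariate_cdf(a_plus, b_plus, rho)
            - K2 * exp(-r * tau2) * bivariate_cdf(a_minus, b_minus, rho)
            - K1 * exp(-r * tau1) * norm_cdf(a_minus))

def call_on_call_quadrature(S, K1, K2, tau1, tau2, r, q, sigma):
    """Direct evaluation of (1.395), integrating over a standard normal variable z."""
    S_star = critical_spot(K1, K2, tau2 - tau1, r, q, sigma)
    mu = (r - q - 0.5 * sigma ** 2) * tau1
    vol = sigma * sqrt(tau1)
    z_star = (log(S_star / S) - mu) / vol  # the pay-off vanishes below this point
    def integrand(z):
        S1 = S * exp(mu + vol * z)
        return norm_pdf(z) * (bs_call(S1, K2, tau2 - tau1, r, q, sigma) - K1)
    return exp(-r * tau1) * simpson(integrand, z_star, max(z_star, 0.0) + 10.0)

# Illustrative parameters (assumed, not from the text).
args = (100.0, 5.0, 100.0, 0.5, 1.0, 0.05, 0.02, 0.25)
closed_form = call_on_call(*args)
by_quadrature = call_on_call_quadrature(*args)
underlying_call = bs_call(100.0, 100.0, 1.0, 0.05, 0.02, 0.25)
print(closed_form, by_quadrature, underlying_call)
```

The compound option value should match the quadrature result up to discretization error and should lie below the value of the underlying call, since the pay-off $(C_{T_1} - K_1)_+$ never exceeds $C_{T_1}$.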


Problems 


Within the problems involving a single underlying asset or stock, assume we are in a
Black-Scholes world where the asset price process obeys geometric Brownian motion of the form

$$
dA_t = (r + q\,\sigma_A) A_t\, dt + \sigma_A A_t\, dW_t, \tag{1.407}
$$

where $dW_t$ is the assumed Brownian increment under the given measure, the interest rate $r$
(in the appropriate economy) and volatility $\sigma_A$ are constants, and $q$ is a market price of risk.


Problem 1. Find the price of a call option on foreign stock struck in foreign currency, i.e.,
of the contract with payoff

$$
C_T = X_T (S_T^f - K)_+. \tag{1.408}
$$

Problem 2. Find the price of a call option on foreign stock struck in domestic currency
with payoff

$$
C_T = (X_T S_T^f - K)_+, \tag{1.409}
$$

where $X_t$ is the exchange rate at time $t$.

Problem 3. Consider again the example of the quanto option in Example 3. Compute the
coefficient $\alpha$ in such a way that the price process

$$
g_t = X_t e^{\alpha t} S_t^f \tag{1.410}
$$

is a domestic asset price process. Further, price the quanto option in Example 3 using $g_t$ as a
numeraire asset. Describe the replication strategy for the numeraire asset $g_t$.


Problem 4. Derive the price of an Elf-X option from the point of view of the foreign investor
taking as payoff

$$
C_T = (S_T^f - K S_T^f Y_T)_+, \tag{1.411}
$$

where $Y_T = 1/X_T$.


Problem 5. A forward starting call on a stock $S$ is structured as follows. The holder will
receive at a preassigned future time $T_1$ a call struck at $K = \alpha S_{T_1}$ and maturing at time $T_2 > T_1$.
Here, $\alpha$ is a positive preassigned constant and $S_{T_1}$ is the stock price realized at time $T_1$.

Find (i) the present time $t = t_0 < T_1$ price of the forward starting call prior to maturity $T_1$
and (ii) a static hedging strategy that applies up to time $T_1$. Using the result in (i), show that
the price of the contract simplifies to that of a standard call struck at $K = \alpha S_0$ with time to
maturity $T_2 - t_0$ in the limiting case that $T_1 \to t_0$ (with $t_0$, $T_2$ held fixed). On the other hand,
show that in the limit $T_2 \to T_1$ (with $t_0$, $T_1$ held fixed) the contract price is simply given by
$S_0(1 - \alpha)_+$. This last result is consistent with the price of a standard call with maturity $t = T_1$
and strike $K = \alpha S_{T_1}$.


Problem 6. Consider two stocks $S^1$ and $S^2$ described by correlated geometric Brownian
motion with constant volatilities $\sigma_1$ and $\sigma_2$ and with correlation $\rho$. As seen in Section 1.6, a
simple chooser option yields the pay-off as the maximum of the two stock levels,

$$
\max(S_T^1, S_T^2), \tag{1.412}
$$

at the maturity date $T$. Find the price of this instrument at time $t < T$. Find the relationship
between the price of this chooser option and that of the chooser with payoff $(S_T^2 - S_T^1)_+$.

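One Solution: To solve for either option price, pick the price of stock 1 as numeraire,
$g_t = S_t^1$. So, for instance, to price the latter option, show that the price $C_t$ is given by an
expectation

$$
C_t = S_t^1 E_t^{Q(S^1)}\!\left[ (f_T - 1)_+ \right], \tag{1.413}
$$

where the random variable $f_t = S_t^2/S_t^1$ obeys

$$
\frac{df_t}{f_t} = (\rho\sigma_2 - \sigma_1)\, dW_t^1 + \sigma_2\sqrt{1 - \rho^2}\, dW_t^2. \tag{1.414}
$$

From this, show that we have

$$
\log\frac{f_T}{f_t} = -\frac{\nu^2}{2}(T - t) + (\rho\sigma_2 - \sigma_1)(W_T^1 - W_t^1) + \sigma_2\sqrt{1 - \rho^2}\,(W_T^2 - W_t^2), \tag{1.415}
$$

where $W_t^i$ are independent Wiener processes at time $t$ and

$$
\nu = \sqrt{\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2} = |\sigma_f|,
$$

with $\sigma_f$ denoting the volatility vector of $f_t$ in equation (1.414).
Since $\log(f_T/f_t)$ is normally distributed, find its mean and variance and thereby obtain the
lognormal drift and volatility of $f_t$, i.e., the lognormal density $p = p(f_T, f_t; T - t)$, giving
the price

$$
C_t = S_t^2 N(d_+) - S_t^1 N(d_-), \tag{1.416}
$$

where

$$
d_\pm = \frac{\log(S_t^2/S_t^1) \pm \tfrac{1}{2}\nu^2 (T - t)}{\nu\sqrt{T - t}}. \tag{1.417}
$$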





Problem 7. Derive the standard call option-pricing formula of Example 1 of this section,
but this time use the stock price as numeraire, i.e., $g_t = S_t$. In particular, show that with this
choice of numeraire,

$$
\frac{d(1/S_t)}{(1/S_t)} = -r\, dt - \sigma\, dW_t, \tag{1.418}
$$

where $dW_t$ stands for the Brownian increment under the measure $Q(S)$ with $S_t$ as numeraire.
Then show that this leads to

$$
C_t(S_t, K, T) = K S_t\, E_t^{Q(S)}\!\left[ (1/K - 1/S_T)_+ \right]. \tag{1.419}
$$

Note: This is related to the price of a European put contract where the random variable
is now the inverse of the stock price struck at the inverse of the strike, i.e., $1/K$, and with
drift $= -r$. Compute this expectation to obtain the final expression.


Problem 8. Consider a foreign money-market account $B_t^f = e^{\int_0^t r_s^f\, ds}$ (with interest rate in
foreign currency given by $r_t^f$ at time $t$), a domestic asset with price $A_t^d$, and a foreign asset with
price $A_t^f$. Let $X_t$ be the exchange rate process in converting foreign currency into domestic.
Suppose we choose $g_t = A_t^f$ as our numeraire asset. Compute the drift of the following
processes: $X_t$, $B_t^f$, and $A_t^f$, within the $Q(g)$ measure.


Problem 9. Consider a domestic asset with price $A_t^d$ and a foreign asset with price $A_t^f$. Let
the constant $\kappa$ be the conversion factor

$$
\kappa = \frac{A_0^d}{A_0^f}. \tag{1.420}
$$

Note that this is given in terms of the asset prices at some current time $t = 0$.
(i) Find a pricing formula for the contract at current time $t = 0$ with payoff function

$$
\max(A_T^d, \kappa A_T^f) \tag{1.421}
$$

at maturity $t = T$. Assume that all relevant lognormal volatilities and correlations are constant.
(ii) How can one hedge this contract? Is it necessary to trade the foreign currency
dynamically?


Problem 10. Derive pricing formulas analogous to equation (1.406) for (i) a call-on-a-put,
(ii) a put-on-a-put, and (iii) a put-on-a-call.


1.13 Partial Differential Equations for Pricing Functions and Kernels


Consider the continuous-time model with state-dependent volatility

$$
\frac{dS_t}{S_t} = \left( r(t) + q\,\sigma(S_t, t) \right) dt + \sigma(S_t, t)\, dW_t, \tag{1.422}
$$

where $q$ is the price of risk (also equal to the volatility of the numeraire asset). Here, $r(t)$ is
a deterministic, time-dependent short rate consistent with the term structure of interest rates.
The state-dependent volatility $\sigma(S, t)$ is sometimes called the local volatility.

The asset price process $A_t$ of a European-style option contingent on the asset $S$ in the
model described by equation (1.422) is given by a pricing function $A(S, t)$ through a formula
of the form

$$
A_t = A(S_t, t). \tag{1.423}
$$

The existence of a pricing function is an expression of the fact that the current price of a
European option depends only on current calendar time and on the current (i.e., spot) price
$S = S_t$ for the underlying asset (assuming all other contract parameters, such as the
maturity time, are held fixed).


Theorem. (Black-Scholes Equation) The pricing function $A(S, t)$ of a European claim
contingent on the asset $S$ in equation (1.422) satisfies the Black-Scholes equation

$$
\frac{\partial A}{\partial t} + \frac{\sigma^2 S^2}{2} \frac{\partial^2 A}{\partial S^2} + r S \frac{\partial A}{\partial S} - r A = 0, \tag{1.424}
$$

where $r = r(t)$, $\sigma = \sigma(S, t)$.
This is a backward time parabolic partial differential equation related closely to the
backward Kolmogorov equation, as we shall see later.
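
As a quick sanity check of equation (1.424), one can evaluate its left-hand side by finite differences on a known pricing function. The sketch below does this for the special case of constant $r$ and $\sigma$ with no dividends, using the standard Black-Scholes call formula; the step sizes, the parameter values, and the function names are assumptions chosen for illustration.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def call_price(S, t, K=100.0, T=1.0, r=0.05, sigma=0.2):
    """Black-Scholes call value A(S, t) for constant r and sigma (no dividends)."""
    tau = T - t
    d_plus = (log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    d_minus = d_plus - sigma * sqrt(tau)
    return S * norm_cdf(d_plus) - K * exp(-r * tau) * norm_cdf(d_minus)

def bs_residual(A, S, t, r, sigma, hS=1e-2, ht=1e-5):
    """Left-hand side of (1.424), with derivatives replaced by central differences."""
    dA_dt = (A(S, t + ht) - A(S, t - ht)) / (2.0 * ht)
    dA_dS = (A(S + hS, t) - A(S - hS, t)) / (2.0 * hS)
    d2A_dS2 = (A(S + hS, t) - 2.0 * A(S, t) + A(S - hS, t)) / hS ** 2
    return dA_dt + 0.5 * sigma ** 2 * S ** 2 * d2A_dS2 + r * S * dA_dS - r * A(S, t)

# Residuals at a few spot levels, at calendar time t = 0.3 (illustrative values).
residuals = [bs_residual(call_price, S, 0.3, 0.05, 0.2) for S in (80.0, 100.0, 125.0)]
print(residuals)
```

The residuals should be zero up to the finite-difference truncation and rounding error.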


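Proof. Choosing as numeraire asset the money-market account $B_t = e^{\int_0^t r(s)\, ds}$, the price of risk is
$q = 0$ and the risk-neutral pricing formula yields

$$
E_t^{Q}\!\left[ dA_t \right] = r(t) A_t\, dt. \tag{1.425}
$$

Equation (1.424) follows by applying Itô's lemma to the calculation of $dA_t = dA(S_t, t)$.
Namely,

$$
E_t^{Q}\!\left[ dA_t \right] = \left( \frac{\partial A}{\partial t} + r S \frac{\partial A}{\partial S} + \frac{\sigma^2 S^2}{2} \frac{\partial^2 A}{\partial S^2} \right) dt, \tag{1.426}
$$

$r = r(t)$, $\sigma = \sigma(S, t)$. Lastly, note that this follows simply from the Feynman-Kac
theorem. □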


A second important partial differential equation concerns the probability density function
$P(S, t)$ under the risk-neutral measure for the stock price values $S$ at time $t$, given an initial
Dirac delta function distribution at time $t = t_0$:

$$
P(S, t = t_0) = \delta(S - S_0). \tag{1.427}
$$

More explicitly, this function is given by $P(S, t) = p(S, t; S_0, t_0)$; i.e., this is the risk-neutral
transition probability density for the price of the underlying asset to begin at value $S_0$
at initial time $t_0$ and end with value $S_t = S$ at time $t$. The function $p(S, t; S_0, t_0)$ is also
commonly referred to as a pricing kernel. We have already seen a specific example of this
as the lognormal transition density for geometric Brownian motion. In general, the resulting
equation, called the Fokker-Planck (or forward Kolmogorov) equation, is contained in the
following statement.


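Theorem 1.7. (Fokker-Planck Equation) The probability density function $P(S, t)$ under the
risk-neutral measure for the stock price values $S$ at time $t$ satisfying initial condition (1.427)
obeys the following equation:

$$
\frac{\partial P}{\partial t} = \frac{1}{2} \frac{\partial^2}{\partial S^2}\!\left( \sigma^2 S^2 P \right) - r \frac{\partial}{\partial S}\!\left( S P \right), \tag{1.428}
$$

where $r = r(t)$, $\sigma = \sigma(S, t)$.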
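Proof. This result can be derived as a consequence of the Black-Scholes equation. Consider
a generic asset with pricing function $A(S, t)$ defined in the interval $t \in (t_0, T)$; we then have
from risk-neutral valuation that at any time $t$,

$$
A(S_0, t_0) = e^{-\int_{t_0}^{t} r(s)\, ds} \int_0^{\infty} P(S, t) A(S, t)\, dS. \tag{1.429}
$$

Note here that we assume that the range of solution is $S \in (0, \infty)$, although the derivation
can be extended to cases with different ranges. Taking the partial derivative with respect to
calendar time $t$ on both sides of this equation, we find

$$
\int_0^{\infty} \left[ -r P A + A \frac{\partial P}{\partial t} + P \left( r A - r S \frac{\partial A}{\partial S} - \frac{\sigma^2 S^2}{2} \frac{\partial^2 A}{\partial S^2} \right) \right] dS = 0,
$$

where $r = r(t)$, $\sigma = \sigma(S, t)$, and the Black-Scholes equation (1.424) has been used for $\partial A/\partial t$.
Integrating the last two terms by parts we obtain:

$$
-\int_0^{\infty} P S \frac{\partial A}{\partial S}\, dS
= -\Big[ P S A \Big]_0^{\infty} + \int_0^{\infty} A \frac{\partial}{\partial S}(S P)\, dS
= \int_0^{\infty} A \frac{\partial}{\partial S}(S P)\, dS,
$$

and

$$
\begin{aligned}
\int_0^{\infty} \frac{\sigma^2 S^2}{2} P \frac{\partial^2 A}{\partial S^2}\, dS
&= \left[ \frac{\sigma^2 S^2 P}{2} \frac{\partial A}{\partial S} \right]_0^{\infty} - \int_0^{\infty} \frac{\partial A}{\partial S} \frac{\partial}{\partial S}\!\left( \frac{\sigma^2 S^2 P}{2} \right) dS \\
&= -\frac{1}{2} \left[ A \frac{\partial}{\partial S}\!\left( \sigma^2 S^2 P \right) \right]_0^{\infty} + \frac{1}{2} \int_0^{\infty} A \frac{\partial^2}{\partial S^2}\!\left( \sigma^2 S^2 P \right) dS \\
&= \frac{1}{2} \int_0^{\infty} A \frac{\partial^2}{\partial S^2}\!\left( \sigma^2 S^2 P \right) dS.
\end{aligned}
$$

In the last equation we have integrated by parts twice. Notice that the nonintegral terms all
vanish, due to the boundary conditions on the probability density function $P$, namely, that
the function $P$ and its first and second partial derivatives with respect to $S$ are assumed to
be rapidly decaying functions of $S$ as $S \to 0$ and $S \to \infty$. Collecting terms gives that for any
derivative pricing function $A(S, t)$,

$$
\int_0^{\infty} A(S, t) \left[ \frac{\partial P}{\partial t} + r \frac{\partial}{\partial S}(S P) - \frac{1}{2} \frac{\partial^2}{\partial S^2}\!\left( \sigma^2 S^2 P \right) \right] dS = 0.
$$

This can only occur if the integrand term in brackets is identically zero; hence equation (1.428)
is fulfilled. □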
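
Equation (1.428) can also be checked numerically on the one density for which a closed form is available here, the lognormal density of geometric Brownian motion. The sketch below assumes constant $r$ and $\sigma$ and approximates the derivatives by central differences; the parameter values and function names are illustrative assumptions.

```python
from math import exp, log, pi, sqrt

S0, T0, R, SIGMA = 100.0, 0.0, 0.05, 0.2  # illustrative constants (assumed)

def density(S, t):
    """Risk-neutral lognormal density P(S, t) started from S0 at time T0."""
    tau = t - T0
    z = log(S / S0) - (R - 0.5 * SIGMA ** 2) * tau
    return exp(-z * z / (2.0 * SIGMA ** 2 * tau)) / (S * SIGMA * sqrt(2.0 * pi * tau))

def fokker_planck_residual(S, t, hS=1e-2, ht=1e-5):
    """dP/dt - (1/2) d^2(sigma^2 S^2 P)/dS^2 + r d(S P)/dS by central differences."""
    dP_dt = (density(S, t + ht) - density(S, t - ht)) / (2.0 * ht)

    def drift_term(s):
        return s * density(s, t)

    def diffusion_term(s):
        return SIGMA ** 2 * s ** 2 * density(s, t)

    d_drift = (drift_term(S + hS) - drift_term(S - hS)) / (2.0 * hS)
    d2_diffusion = (diffusion_term(S + hS) - 2.0 * diffusion_term(S)
                    + diffusion_term(S - hS)) / hS ** 2
    return dP_dt - 0.5 * d2_diffusion + R * d_drift

residuals = [fokker_planck_residual(S, 0.5) for S in (85.0, 100.0, 120.0)]
print(residuals)
```

Each residual should vanish up to discretization error, in line with the statement of Theorem 1.7 for this special case.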


The corresponding backward Kolmogorov equation for the density is given by the so-called
Lagrange adjoint of equation (1.428). By combining equations (1.429) [with $P(S, t) =
P = p(S, t; S_0, t_0)$] and (1.424), we readily see that $e^{-\int_{t_0}^{t} r(s)\, ds} P$ must satisfy the same equation
as $A(S_0, t_0)$ for all initial times $t_0 < t$. Simplifying the equation in terms of $P$ only, we find
the backward Kolmogorov equation:

$$
\frac{\partial P}{\partial t_0} + \frac{1}{2} \sigma^2(S_0, t_0) S_0^2 \frac{\partial^2 P}{\partial S_0^2} + r(t_0) S_0 \frac{\partial P}{\partial S_0} = 0. \tag{1.430}
$$

This is a backward-time parabolic partial differential equation of the form of the
Black-Scholes equation [i.e., replacing $(S, t)$ by $(S_0, t_0)$ in equation (1.424)]. The only term missing
is the compounding term $r(t_0) A$. However, as just mentioned, the function $e^{-\int_{t_0}^{t} r(s)\, ds} P$ does
exactly satisfy the Black-Scholes equation. This is, not surprisingly, consistent with our
discussion in Section 1.8, where we showed [see equation (1.231)] that the discounted
transition density gives the current price of a European butterfly option with infinitely narrow
spread (i.e., the price of an Arrow-Debreu security).

A partial differential equation satisfied by the pricing function of European-style call 
options C(S, t; K, T) regarded explicitly as functions of the strike and maturity time arguments 
(K,T) [instead of functions of the arguments (S, t), which are held fixed] can now be derived 
as follows. 


Theorem 1.8. (Dual Black-Scholes Equation) The pricing function for a European call
option $C(S, t; K, T)$ satisfies the following equation:

$$
\frac{\partial C}{\partial T} = -r(T) K \frac{\partial C}{\partial K} + \frac{1}{2} K^2 \sigma^2(K, T) \frac{\partial^2 C}{\partial K^2}. \tag{1.431}
$$

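Proof. European-style call prices admit the following representation in terms of the
risk-neutral transition probability density [i.e., the density for the risk-neutral measure $Q(B)$]:

$$
C(K, T) = Z_0(T)\, E_0^{Q}\!\left[ (S_T - K)_+ \right] = Z_0(T) \int_0^{\infty} P(S, T) (S - K)_+\, dS, \tag{1.432}
$$

where $Z_0(T) = e^{-\int_0^T r(s)\, ds}$. Without loss of generality we simply set current time $t = 0$. Using
the property $\partial (S - K)_+/\partial K = -\theta(S - K)$, where $\theta(x)$ is the Heaviside step function with
value 1 for $x > 0$ and value 0 for $x < 0$, the first and second derivatives of equation (1.432)
with respect to the strike $K$ give

$$
\frac{\partial C}{\partial K} = -Z_0(T) \int_K^{\infty} P(S, T)\, dS, \tag{1.433}
$$

and

$$
\frac{\partial^2 C}{\partial K^2} = Z_0(T) P(K, T). \tag{1.434}
$$

The derivative with respect to maturity is given by

$$
\begin{aligned}
\frac{\partial C}{\partial T}
&= -r(T) C + Z_0(T) \int_0^{\infty} \frac{\partial P}{\partial T} (S - K)_+\, dS \\
&= -r C + Z_0(T) \int_0^{\infty} \left[ \frac{1}{2} \frac{\partial^2}{\partial S^2}\!\left( \sigma^2 S^2 P \right) - r \frac{\partial}{\partial S}(S P) \right] (S - K)_+\, dS,
\end{aligned}
$$

where $r = r(T)$, $\sigma = \sigma(S, T)$. Note that we have used equation (1.428) with $t = T$. The
integral containing the first derivative with respect to $S$ can be evaluated by parts as follows:

$$
\begin{aligned}
\int_0^{\infty} (S - K)_+ \frac{\partial}{\partial S}(S P)\, dS
&= -\int_K^{\infty} S P\, dS \\
&= -\int_0^{\infty} (S - K)_+ P\, dS - K \int_K^{\infty} P\, dS \\
&= \left[ -C + K \frac{\partial C}{\partial K} \right] Z_0(T)^{-1},
\end{aligned}
$$

where we used the identity $S = (S - K)_+ + K$, for $S \in [K, \infty)$, and equations (1.432)
and (1.433). The integral containing the second derivative can again be evaluated by parts:

$$
\begin{aligned}
\int_0^{\infty} (S - K)_+ \frac{\partial^2}{\partial S^2}\!\left( \sigma^2 S^2 P \right) dS
&= -\int_K^{\infty} \frac{\partial}{\partial S}\!\left( \sigma^2 S^2 P \right) dS \\
&= \sigma^2(K, T) K^2 P(K, T) \\
&= Z_0(T)^{-1} \sigma^2(K, T) K^2 \frac{\partial^2 C}{\partial K^2}.
\end{aligned}
$$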

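Collecting the intermediate results obtained so far, we arrive at the following dual
Black-Scholes equation:

$$
\begin{aligned}
\frac{\partial C}{\partial T}
&= -r C + r C - r K \frac{\partial C}{\partial K} + \frac{1}{2} K^2 \sigma^2(K, T) \frac{\partial^2 C}{\partial K^2} \\
&= -r K \frac{\partial C}{\partial K} + \frac{1}{2} K^2 \sigma^2(K, T) \frac{\partial^2 C}{\partial K^2}.
\end{aligned}
$$

□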


A consequence of this result is the following, which may be used in practice to calibrate
a local volatility surface $\sigma = \sigma(K, T)$ via market European call option prices across a range
of maturities and strikes.


Theorem 1.9. (Derman-Kani) If a local volatility function exists, then it is unique and it
can be expressed in analytical closed form as follows in terms of call option prices:

$$
\sigma^2(K, T) = \frac{2 \left( \dfrac{\partial C}{\partial T} + r K \dfrac{\partial C}{\partial K} \right)}{K^2 \dfrac{\partial^2 C}{\partial K^2}}. \tag{1.435}
$$
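
A simple consistency check of formula (1.435): if the call prices are generated by a Black-Scholes model with constant volatility, the recovered local volatility should be that same constant at every strike. The sketch below assumes constant $r$, no dividends, and central finite differences in $K$ and $T$; in practice the derivatives would be estimated from market quotes, which is a much more delicate task than this illustration suggests. All names and parameter values are assumptions.

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def call_price(K, T, S0=100.0, r=0.03, sigma=0.25):
    """Black-Scholes call price seen as a function of strike K and maturity T."""
    d_plus = (log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d_minus = d_plus - sigma * sqrt(T)
    return S0 * norm_cdf(d_plus) - K * exp(-r * T) * norm_cdf(d_minus)

def local_vol(C, K, T, r, hK=1e-2, hT=1e-5):
    """Local volatility from equation (1.435), derivatives by central differences."""
    dC_dT = (C(K, T + hT) - C(K, T - hT)) / (2.0 * hT)
    dC_dK = (C(K + hK, T) - C(K - hK, T)) / (2.0 * hK)
    d2C_dK2 = (C(K + hK, T) - 2.0 * C(K, T) + C(K - hK, T)) / hK ** 2
    return sqrt(2.0 * (dC_dT + r * K * dC_dK) / (K ** 2 * d2C_dK2))

recovered = [local_vol(call_price, K, 1.0, 0.03) for K in (80.0, 100.0, 130.0)]
print(recovered)
```

All three values should be close to the input volatility of 0.25.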


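This PDE pricing formalism extends readily into arbitrary dimensions. A general
connection between a system of SDEs and the corresponding forward (backward) Kolmogorov
PDEs that govern the transition probability density is as follows. Consider a diffusion model
with $n$ correlated random processes $\mathbf{x}_t = (x_t^1, \ldots, x_t^n) \in \mathbb{R}^n$ satisfying the system of SDEs:

$$
\frac{dx_t^i}{x_t^i} = \mu_i(\mathbf{x}_t, t)\, dt + \sum_{\alpha=1}^{M} \sigma_{i,\alpha}(\mathbf{x}_t, t)\, dW_t^{\alpha}, \qquad i = 1, \ldots, n, \tag{1.436}
$$

with $M \ge 1$ independent Brownian motions, $dW_t^{\alpha}\, dW_t^{\beta} = \delta_{\alpha\beta}\, dt$, and where the drifts and
volatilities are generally functions of time $t$ and $\mathbf{x}_t$. Let us define the differential operator $\mathcal{L}_{\mathbf{x}}$ by

$$
\mathcal{L}_{\mathbf{x}} f \equiv \sum_{i=1}^{n} x_i \mu_i(\mathbf{x}, t) \frac{\partial f}{\partial x_i} + \frac{1}{2} \sum_{i,j=1}^{n} x_i x_j \nu_{ij}(\mathbf{x}, t) \frac{\partial^2 f}{\partial x_i \partial x_j}, \tag{1.437}
$$

with Lagrange adjoint operator $\mathcal{L}_{\mathbf{x}}^{*}$ given by

$$
\mathcal{L}_{\mathbf{x}}^{*} f \equiv -\sum_{i=1}^{n} \frac{\partial \left[ x_i \mu_i(\mathbf{x}, t) f \right]}{\partial x_i} + \frac{1}{2} \sum_{i,j=1}^{n} \frac{\partial^2 \left[ x_i x_j \nu_{ij}(\mathbf{x}, t) f \right]}{\partial x_i \partial x_j}, \tag{1.438}
$$

where the functions $\nu_{ij}$, $i, j = 1, \ldots, n$, are defined by

$$
\nu_{ij}(\mathbf{x}, t) = \sum_{\alpha=1}^{M} \sigma_{i,\alpha}(\mathbf{x}, t)\, \sigma_{j,\alpha}(\mathbf{x}, t). \tag{1.439}
$$

These operators act on any sufficiently differentiable function $f = f(\mathbf{x}, t)$. The transition
probability density $p = p(\mathbf{x}, t; \mathbf{x}_0, t_0)$ associated with the foregoing diffusion process then
satisfies the forward (Fokker-Planck) Kolmogorov PDE,

$$
\frac{\partial p}{\partial t} = \mathcal{L}_{\mathbf{x}}^{*}\, p, \tag{1.440}
$$

as well as the corresponding backward PDE,

$$
\frac{\partial p}{\partial t_0} + \mathcal{L}_{\mathbf{x}_0}\, p = 0, \tag{1.441}
$$

for all $t_0 < t$, with initial (or final) time condition

$$
p(\mathbf{x}, t = t_0; \mathbf{x}_0, t_0) = p(\mathbf{x}, t; \mathbf{x}_0, t_0 = t) = \delta(\mathbf{x} - \mathbf{x}_0).
$$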


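Assuming that a diffusion path starting at some point $\mathbf{x}_0$ at time $t_0$ and ending at a point $\mathbf{x}$ at
time $t$ must be at all possible points $\bar{\mathbf{x}}$ at any intermediate time $\bar{t}$, $t_0 < \bar{t} < t$, then a consistency
requirement in the theory is the so-called Chapman-Kolmogorov integral equation:

$$
p(\mathbf{x}, t; \mathbf{x}_0, t_0) = \int_{\mathbb{R}^n} p(\mathbf{x}, t; \bar{\mathbf{x}}, \bar{t})\, p(\bar{\mathbf{x}}, \bar{t}; \mathbf{x}_0, t_0)\, d\bar{\mathbf{x}}. \tag{1.442}
$$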
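
To make the consistency requirement (1.442) concrete, the following sketch checks it numerically in the simplest setting: one dimension, with the lognormal transition density of geometric Brownian motion with constant $r$ and $\sigma$. The intermediate integral is taken over $y = \log \bar{S}$ with a composite Simpson rule on a truncated range. The density formula is the standard lognormal one; the function names, truncation range, and parameter values are assumptions made for illustration.

```python
from math import exp, log, pi, sqrt

R, SIGMA = 0.05, 0.2  # illustrative constants (assumed)

def density(S_end, S_start, tau):
    """Lognormal transition density p(S_end, S_start; tau) under the risk-neutral measure."""
    z = log(S_end / S_start) - (R - 0.5 * SIGMA ** 2) * tau
    return exp(-z * z / (2.0 * SIGMA ** 2 * tau)) / (S_end * SIGMA * sqrt(2.0 * pi * tau))

def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def chapman_kolmogorov_integral(S, S0, t0, t_mid, t):
    """Integrate over the intermediate price, using y = log(S_mid) so dS_mid = e^y dy."""
    def integrand(y):
        S_mid = exp(y)
        return density(S, S_mid, t - t_mid) * density(S_mid, S0, t_mid - t0) * S_mid
    center = log(S0)
    return simpson(integrand, center - 6.0, center + 6.0)

S0, S, t0, t_mid, t = 100.0, 110.0, 0.0, 0.4, 1.0
lhs = chapman_kolmogorov_integral(S, S0, t0, t_mid, t)
rhs = density(S, S0, t - t0)
print(lhs, rhs)
```

The two printed numbers should coincide to many digits, and the result should not depend on the choice of the intermediate time.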


Prices of European-style contingent claims can then be computed by taking integrals over
an appropriate pricing kernel as follows. Suppose we are within a certain measure $Q(g)$ where
underlying assets depend on random variables $x^i$ that have appropriate drift and volatilities
in accordance with equation (1.436). Assuming the existence of a martingale measure where
the numeraire is, for example, of the form $e^{\int_0^t \lambda(\mathbf{x}_s, s)\, ds}$ (i.e., with $\lambda$ as a discounting function),
then according to the asset pricing theorem of the previous section, the price of a contingent
claim $A(\mathbf{x}, t)$ with payoff $\phi(\mathbf{x})$ is given by the expectation

$$
A(\mathbf{x}, t) = E_t^{Q(g)}\!\left[ e^{-\int_t^T \lambda(\mathbf{x}_s, s)\, ds}\, \phi(\mathbf{x}_T) \right]. \tag{1.443}
$$

Then due to the Feynman-Kac formula (in $n$ dimensions) we have the corresponding
Black-Scholes PDE:

$$
\frac{\partial A(\mathbf{x}, t)}{\partial t} + \mathcal{L}_{\mathbf{x}} A(\mathbf{x}, t) - \lambda(\mathbf{x}, t) A(\mathbf{x}, t) = 0, \tag{1.444}
$$

$t < T$, with terminal condition $A(\mathbf{x}, T) = \phi(\mathbf{x})$, as required. From this analysis we see
that the price of the contingent claim satisfying this Black-Scholes type of PDE can in
fact be expressed as an integral over the set of diffusion paths. With the particular choice
$\lambda(\mathbf{x}_t, t) = r(t)$ (the risk-free rate), the density $p$ is the risk-neutral density expressed in
the $\mathbf{x}$-space variables. The claim's price is then simply given by an integral in $\mathbb{R}^n$:

$$
A(\mathbf{x}, t) = e^{-\int_t^T r(s)\, ds} \int_{\mathbb{R}^n} p(\mathbf{x}_T, T; \mathbf{x}, t)\, \phi(\mathbf{x}_T)\, d\mathbf{x}_T. \tag{1.445}
$$


This is a multidimensional extension of equation (1.429). Note also that here, variables x do 
not necessarily represent prices. In general, asset prices are functions of x and time t. A nice 
feature of such integral equations, among others, is the fact that they provide a solution 
whereby the kernel p and hence the expected values can be propagated forward in the time 
variable T, starting from T = t, where the delta function condition is employed. 
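
The one-dimensional case of equation (1.445) can be sketched in a few lines: the price of a European claim is the discounted integral of its payoff against the transition density. The example below assumes the lognormal kernel of geometric Brownian motion with constant $r$ and $\sigma$ (no dividends), integrates over $y = \log S_T$ with a composite Simpson rule, and compares the result for a call payoff with the standard Black-Scholes formula. Function names, the truncation range, and parameter values are illustrative assumptions.

```python
from math import erf, exp, log, pi, sqrt

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def density(S_T, S, tau, r, sigma):
    """Lognormal risk-neutral kernel p(S_T, T; S, t) with tau = T - t."""
    z = log(S_T / S) - (r - 0.5 * sigma ** 2) * tau
    return exp(-z * z / (2.0 * sigma ** 2 * tau)) / (S_T * sigma * sqrt(2.0 * pi * tau))

def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def kernel_price(payoff, S, tau, r, sigma):
    """Discounted integral of payoff against the kernel, in the variable y = log(S_T)."""
    center = log(S) + (r - 0.5 * sigma ** 2) * tau
    width = 10.0 * sigma * sqrt(tau)  # ten standard deviations on each side
    def integrand(y):
        S_T = exp(y)
        return density(S_T, S, tau, r, sigma) * payoff(S_T) * S_T
    return exp(-r * tau) * simpson(integrand, center - width, center + width)

def bs_call(S, K, tau, r, sigma):
    d_plus = (log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    d_minus = d_plus - sigma * sqrt(tau)
    return S * norm_cdf(d_plus) - K * exp(-r * tau) * norm_cdf(d_minus)

S, K, tau, r, sigma = 100.0, 105.0, 1.0, 0.05, 0.2
by_kernel = kernel_price(lambda s: max(s - K, 0.0), S, tau, r, sigma)
closed_form = bs_call(S, K, tau, r, sigma)
print(by_kernel, closed_form)
```

The same `kernel_price` routine accepts any payoff function, which is the practical appeal of the kernel representation; the kink of the call payoff at the strike limits the accuracy of the quadrature somewhat, so agreement is to several digits rather than to machine precision.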


Problems 


Problem 1. Consider the one-dimensional lognormal density $p(S, S_0; t - t_0)$ given by
equation (1.165). Show that it satisfies forward and backward equations of the form (1.428)
and (1.430) as well as the Chapman-Kolmogorov equation,

$$
\int_0^{\infty} p(S, \bar{S}; t - \bar{t})\, p(\bar{S}, S_0; \bar{t} - t_0)\, d\bar{S} = p(S, S_0; t - t_0), \tag{1.446}
$$

$t_0 < \bar{t} < t$.


Problem 2. Consider the n-dimensional lognormal density given by equation (1.198). Verify 
that this density satisfies the appropriate Kolmogorov equations. 


1.14 American Options


In this section we briefly present the theory for pricing American, or early-exercise, options. 
The distinction between an American-style option and its European counterpart is that the 
holder of the American option has the additional freedom or right to exercise the option at any 
date from contract inception until expiration. This additional time optionality generally gives 
rise to an additional worth, appropriately also referred to as the early-exercise premium. We 
mainly focus our discussion on calls and puts, although the theory is also useful for treating 
other types of pay-offs. Throughout this section, we shall assume that we are within a
Black-Scholes world with only one underlying asset. Although the formal theory readily extends into
the multiasset case, the practical implementation and analysis issues are nontrivial and not
within the scope of our present discussion. The development of numerical methods for pricing
multiasset American options remains a topic of active research (see, for example, [BD96,
BG97b, BG97a, BKT01, Gla04]).


1.14.1 Arbitrage-Free Pricing and Optimal Stopping Time Formulation


To begin our discussion, we consider the case where the underlying asset (or stock) price
process $(S_t)_{t \ge 0}$ follows the geometric Brownian motion model as given by equation (1.381)
in the risk-neutral measure, where $r$ is the risk-free interest rate and $q$ is a continuous
dividend yield. We therefore assume that $r \ge 0$, $q \ge 0$, $\sigma$ are constants (i.e., state and time
independent), although the formalism (i.e., the governing equations) readily extends to the
case of state-dependent drift and volatility functions. Let $t_0$ be the present time (i.e., contract
inception). An American call (or put) option struck at $K$ with expiration at time $T$ is a claim
to a payoff $(S_t - K)_+$ (or $(K - S_t)_+$) that the holder can exercise at any intermediate time
$t$ prior to maturity, i.e., $t_0 \le t \le T$. The time at which the option is exercised is a stopping
time. Recall the simpler situation in which the stopping time is initially known (i.e., as in
the case of a European-style claim); then from the theorem of asset pricing the arbitrage-free
price of a claim with a given pay-off occurring at time $t$ is simply given by the discounted
expectation via equation (1.294). In particular, the value at present time $t_0$ of a cash flow
$(S_t - K)_+$ delivered at a later time $t$ is given by

$$
e^{-r(t - t_0)} E_0\!\left[ (S_t - K)_+ \right],
$$

where $E_0[\,\cdot\,] = E^{Q(B)}[\,\cdot \mid \mathcal{F}_{t_0}] = E^{Q(B)}[\,\cdot \mid S_{t_0} = S_0]$ is used as a simplified notation to denote the
expectation at time $t_0$ within the risk-neutral measure $Q(B)$, with $B_t = e^{rt}$ as numeraire,
conditional on $S_{t = t_0} = S_0$. This expectation gives us the fair value of the cash flow as
long as the delivery time $t$ is a given stopping time, which may either be deterministic or
random. For the case in which the stopping time is given by the maturity, e.g., $t = T$, the
foregoing expectation obviously corresponds to the price of a European call [as given by
equation (1.383), with $t$, $S_t$ replaced by $t_0$, $S_0$].

For American contracts the holder has the freedom to exercise at any time within the
continuous set of values $\mathcal{T} = \{t : t_0 \le t \le T\}$, giving rise to an optimal stopping time (i.e.,
early-exercise time) at which the holder should exercise the option for maximal gain. In
particular, we shall see that an early-exercise boundary arises on the $(t, S_t)$-plane (i.e.,
time-spot plane) that separates the domain $[t_0, T] \times \mathbb{R}_+$ into two subdomains. These consist of
a so-called continuation domain, for which the option is not yet exercised, and a stopping
domain, whereby the option is exercised early. Hence, a main distinction from the European
case is that the exercise time is not known in advance and must be optimally determined as
part of the solution to the pricing problem. As observed later, the basic financial reasoning
for the emergence of an early-exercise boundary is that the holder can either claim a profit
from the underlying dividend income by opting to purchase the asset (e.g., for the case of a
call) or profit from the interest that arises from selling the underlying asset and investing the
proceeds in a money-market account (e.g., for a put).

More generally, let us consider a nonnegative payoff function $\phi(S)$, $S \in \mathbb{R}_+$. The values of
the European and corresponding American claim to such a pay-off are given, respectively, by

$$
V_E(S_0, T - t_0) = E_0\!\left[ e^{-r(T - t_0)} \phi(S_T) \right] \tag{1.447}
$$

and

$$
V(S_0, T - t_0) = \sup_{t \in \mathcal{T}} E_0\!\left[ e^{-r(t - t_0)} \phi(S_t) \right]. \tag{1.448}
$$

Throughout this section we use $V_E$ to distinguish the European price from its American
counterpart. In equation (1.448) the supremum is taken over all possible stopping times in the
set $\mathcal{T}$. Note that both pricing functions are functions of the current time to maturity $T - t_0$,
as is generally true when the drift and volatility terms have no explicit time dependence.
We remark that although various theoretical frameworks exist for the determination of optimal
stopping times, exact analytical formulas for such quantities as well as for American option
values in terms of known transcendental functions have not been found to date. This is
the case for the geometric Brownian motion model and, of course, for the more complex
state-dependent models. In Section 1.14.4 we develop an integral-equations approach for
computing the early-exercise boundary and the American option value, whereas in this section
we provide a discrete-time backward induction formulation, which is useful for approximating
the continuous-time quantities.
Formally, the optimal stopping time, denoted by $t^*$, is given by the infimum over the set
$\mathcal{T}$ such that the value of the American option is equal to its intrinsic value (or face value) as
given by the pay-off at the observed asset price:

$$
t^* = \inf\{ t \in \mathcal{T} : V(S_t, T - t) = \phi(S_t) \}. \tag{1.449}
$$

The stopping domain, corresponding to spot and time values for which it is optimal to exercise
prematurely, consists of the set of points

$$
\mathcal{D} = \{ (t, S) : t \in \mathcal{T},\ V(S, T - t) = \phi(S) \}, \tag{1.450}
$$

while the continuation domain, corresponding to spot and time values for which the option
is not exercised prematurely, is the set of points

$$
\mathcal{C} = \{ (t, S) : t \in \mathcal{T},\ V(S, T - t) > \phi(S) \}. \tag{1.451}
$$

Assuming there exists an optimal stopping time $t^*$, then from asset-pricing theory this time
is given implicitly by

$$
E_0\!\left[ e^{-r(t^* - t_0)} \phi(S_{t^*}) \right] = V(S_0, T - t_0). \tag{1.452}
$$


This is a result that is not practical as it stands since the equation involves the American 
option value on the right-hand side, which is itself not yet known and dependent upon the 
stopping domain. This is a common feature among optimal stopping problems for Markov 
processes in continuous time, because they are essentially free-boundary value problems as 
shown shortly. 

The structure of the stopping domains may be quite complicated for certain classes of
payoff functions and diffusion models. However, for standard piecewise call/put types of
pay-offs considered here, the domains turn out to be simply connected. In particular, the
boundary of $\mathcal{D}$ is an early-exercise boundary curve given by

$$
\partial\mathcal{D} = \{ (\tau, S) : 0 \le \tau \le T - t_0,\ S = S^*(\tau) \}, \tag{1.453}
$$

with $S^*(\tau)$ given by a smooth curve

$$
S^*(\tau) = \min\{ S > 0 : V(S, \tau) = (S - K)_+ \} \tag{1.454}
$$

for a call and

$$
S^*(\tau) = \max\{ S > 0 : V(S, \tau) = (K - S)_+ \} \tag{1.455}
$$

for a put struck at $K$. Here the function $V(S, \tau)$ represents the value of the American call
$C(S, K, \tau)$ or put $P(S, K, \tau)$, respectively, where $S$ is the value of the underlying spot. From
equation (1.451) it is obvious that the continuation domain is the set of all points $(\tau, S)$ such
that $V(S, \tau)$ is greater than the respective payoff function at $S$. As we will see, the subscript
$+$ signs are actually redundant in equations (1.454) and (1.455). Note that here we have
simply expressed the boundary and the option price in terms of the time-to-maturity variable
$\tau = T - t \in [0, T - t_0]$ rather than the calendar time $t \in [t_0, T]$. This is convenient for what
follows since the diffusion models are assumed to be time homogeneous. The optimal-exercise
decision for the holder therefore depends on the observed spot (or stock price level) and the
time to maturity (or calendar time) of the observation. In this sense, American options can
be characterized as having a kind of path dependence.

Before any further analysis, we make note of one very basic and important property of
the early-exercise premium (or value): The European option value $V_E$ satisfies the condition
(i) $V_E(S, \tau) \ge \phi(S)$ for all $(S, \tau)$ if and only if the corresponding American option value $V$
satisfies (ii) $V(S, \tau) = V_E(S, \tau)$ for all $(S, \tau)$. That is, if the corresponding European price is
always above its intrinsic value during the contract lifetime, then it is never optimal to exercise
the American option at any time earlier than expiry; i.e., there is no early-exercise premium
and $V = V_E$. To show this, note that equation (1.448) implies $V(S, \tau) \ge V_E(S, \tau)$. Hence
condition (i) gives $V(S, \tau) \ge \phi(S)$, so the American option is always above the intrinsic value,
implying that the holder would not exercise earlier for a lower value. The optimal exercise
(stopping) time is therefore at expiry $T$; hence (i) implies (ii). To prove the converse, observe
that since the American option value must satisfy $V(S, \tau) \ge \phi(S)$ for all $(S, \tau)$, condition (ii)
implies (i). This result is essentially a statement of the fact that an early-exercise boundary
(and premium) arises only if the corresponding European option value falls below the intrinsic
(payoff function) value. Because of this we have the following rather well-known result.


Proposition.
(i) An American call has a nonzero early-exercise premium if and only if $q > 0$.
(ii) An American put has a nonzero early-exercise premium if and only if $r > 0$.


This result will be seen to follow explicitly from the early-exercise boundary properties 
and the formulas for the early-exercise premiums developed in the following subsections. 
However, a simple and instructive proof goes as follows. 


Proof. The put-call parity relation for European calls and puts gives

$$
C_E(S, K, \tau) - P_E(S, K, \tau) = e^{-q\tau} S - e^{-r\tau} K. \tag{1.456}
$$

Rewriting this we have

$$
C_E(S, K, \tau) = S - K + P_E(S, K, \tau) + \left[ (e^{-q\tau} - 1) S - (e^{-r\tau} - 1) K \right]. \tag{1.457}
$$

Since $P_E(S, K, \tau) \ge 0$, then for $q = 0$ either of these expressions gives $C_E(S, K, \tau) \ge S -
e^{-r\tau} K \ge S - K$. Hence $C_E$ is always above its intrinsic value, and from the previous property
we conclude that the European call value is equal to the American call value, $C_E(S, K, \tau) =
C(S, K, \tau)$, so the early-exercise premium is zero. For the case $q > 0$, we use equation (1.457)
and note that since the European put is a decreasing function of $S$, there exist large enough
values of $S > K$ such that $P_E(S, K, \tau) + [(e^{-q\tau} - 1) S - (e^{-r\tau} - 1) K] < 0$, i.e., $C_E(S, K, \tau) <
S - K$ for some $S > K$. From the previous result we therefore have $C(S, K, \tau) \ne C_E(S, K, \tau)$
and hence conclude that the early-exercise premium is nonzero for $q > 0$. This proves (i),
while statement (ii) is proved in a similar fashion by reversing the roles of $S$, $q$ with $K$, $r$ and
is left as an exercise. □
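
The proposition can be illustrated numerically. The sketch below uses a Cox-Ross-Rubinstein binomial lattice, which is a standard discrete approximation but not a method developed in this section, to compare American and European values under different choices of $r$ and $q$. The lattice construction, the function name `crr_price`, and all parameter values are assumptions made for this illustration.

```python
from math import exp, sqrt

def crr_price(S0, K, T, r, q, sigma, steps, is_call, american):
    """Binomial-lattice value of a call or put, with optional early exercise."""
    dt = T / steps
    u = exp(sigma * sqrt(dt))
    d = 1.0 / u
    p = (exp((r - q) * dt) - d) / (u - d)  # risk-neutral up-move probability
    disc = exp(-r * dt)

    def payoff(s):
        return max(s - K, 0.0) if is_call else max(K - s, 0.0)

    # Terminal values; index j counts the number of up moves.
    values = [payoff(S0 * u ** j * d ** (steps - j)) for j in range(steps + 1)]
    # Backward induction through the lattice.
    for n in range(steps - 1, -1, -1):
        for j in range(n + 1):
            continuation = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            if american:
                continuation = max(continuation, payoff(S0 * u ** j * d ** (n - j)))
            values[j] = continuation
    return values[0]

def premium(r, q, is_call, S0=100.0, K=100.0, T=1.0, sigma=0.2, steps=400):
    """Early-exercise premium: American value minus European value on the same lattice."""
    american = crr_price(S0, K, T, r, q, sigma, steps, is_call, True)
    european = crr_price(S0, K, T, r, q, sigma, steps, is_call, False)
    return american - european

call_premium_no_dividend = premium(r=0.05, q=0.0, is_call=True)
call_premium_dividend = premium(r=0.05, q=0.08, is_call=True)
put_premium_zero_rate = premium(r=0.0, q=0.0, is_call=False)
put_premium_positive_rate = premium(r=0.05, q=0.0, is_call=False)
print(call_premium_no_dividend, call_premium_dividend)
print(put_premium_zero_rate, put_premium_positive_rate)
```

On the lattice the call premium vanishes (to rounding error) when $q = 0$ and is strictly positive when $q > 0$, while the put premium vanishes when $r = 0$ and is positive when $r > 0$, in agreement with statements (i) and (ii).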


An obvious consequence of this proposition is that: (i) for an American call on a non- 
dividend-paying stock the exercise boundary is trivial (i.e., it is never optimal to exercise 
early), and (ii) for an American put on a nondividend-paying stock the exercise boundary is 
nontrivial (i.e., there is an optimal early-exercise time) if the interest rate is positive. In what 
follows (and also from the framework of Section 1.14.4) we will be able to further assess 
such properties.


Pricing by Recurrence: Dynamic Programming Approach


We now consider specifically the recursive formulation for pricing American options. This 
involves an iteration method that goes backward in calendar time (or forward in time to 
maturity). Formally, the American option price is given by equation (1.448). In order to 
actually implement this formula in a practical manner, we subdivide the time interval $[t_0, T] =
[t_0, t_1, \ldots, t_N = T]$ into $N \ge 1$ subintervals $[t_i, t_{i+1}]$, $\delta t_i = t_{i+1} - t_i > 0$, $i = 0, \ldots, N-1$.
For notational purposes it is useful to introduce the price function V,(S). For the case of 
time-homogeneous diffusions we have 


$$V_t(S) = V(S, T - t) = V(S, \tau), \tag{1.458}$$


with $\tau = T - t$ being the time remaining to maturity. We therefore assume that exercise
can only occur at a fixed set of (intermediate stopping) times given by $\{t_i : i = 0, \ldots, N\}$.
Equation (1.448) can then be approximated by


$$V_0(S_0) = \sup_{t^* \in \{t_i;\ i = 0, \ldots, N\}} E_0\left[e^{-r(t^* - t_0)}\phi(S_{t^*})\right], \tag{1.459}$$


where $V_0(S_0) = V_{t_0}(S_0) = V(S_0, T - t_0)$. For small $\delta t_i$ values we expect equation (1.459) to be a
good approximation to equation (1.448). From the theory of optimal stopping rules, one can
show that in the limit $\delta t_i \to 0$ ($N \to \infty$) this approximation approaches the exact American
option value in equation (1.448), which allows for continuous-time exercise. We remark that
equation (1.459) actually gives the exact price of a Bermudan option with payoff function $\phi$.
Bermudans are bona fide contracts that essentially lie in between European and American
contracts and are in reality structured specifically with only a fixed set of allowable exercise
dates. Moreover, in any realistic trading strategy it is interesting to note that the actual
information on asset price levels can only be accessible to the trader at intermittent times
(i.e., at best one obtains “tick-by-tick” data). Hence, for the holder of an American option
the exercise decision times, although approaching the continuum limit, essentially occur at
discretely spaced points in time.

By discretizing time, the underlying asset price process with values $S_{t_i} \in \mathbb{R}_+$, $i = 0, \ldots, N$,
is then a Markov chain. Iterating backward in calendar time starting from maturity, equation (1.459)
is readily shown to imply that the option price at any intermediate time satisfies
the recurrence relation


$$V_{t_i}(S) = \max\left\{\phi(S),\ E_{t_i}\left[e^{-r\delta t_i}V_{t_{i+1}}(S_{t_{i+1}}) \mid S_{t_i} = S\right]\right\}, \tag{1.460}$$

$i = N-1, \ldots, 0$, where $V_T(S) = \phi(S)$. This result states that the option price at each date $t_i$
is given by the maximum of the pay-off (or the immediate-exercise value) and the discounted
expected value of continuing without early exercise at time $t_i$. Note that at each $i$th step the
expectation is conditional on $S_{t_i} = S$. [Remark: Equation (1.460) can also be rewritten as a
forward recurrence relation in terms of a discretized time to maturity variable $\tau_i = T - t_i$
using equation (1.458)]. This formulation can be applied to asset prices that obey diffusion
processes with generally state- and time-dependent drift and volatility functions. Here and
in the following subsections, however, we are assuming time-homogeneous solutions; i.e.,
the drift and volatility functions of the asset price process are only allowed to be explicitly
state dependent. Assuming a generally state-dependent Markov diffusion process $(S_t)_{t \ge 0}$,
$S_t \in \mathbb{R}_+$, with assumed risk-neutral transition probability density function $p(S', S; \tau)$, the
earlier expectation then gives


$$V_{t_i}(S) = \max\left\{\phi(S),\ \tilde{V}_{t_i}(S)\right\}, \tag{1.461}$$

where

$$\tilde{V}_{t_i}(S) = e^{-r\delta t_i}\int_0^\infty p(S', S; \delta t_i)\,V_{t_{i+1}}(S')\,dS' \tag{1.462}$$

represents the continuation value of the option at time $t_i$. For the particular process of
equation (1.381), $p$ is specifically the lognormal density function given by equation (1.382).
In this iteration approach, the American (or Bermudan) option prices are obtained without
necessarily computing the early-exercise boundary. However, this can also be obtained
simultaneously. From equation (1.461) we see that equations (1.449), (1.450), and (1.451) give the
stopping rule


$$t^* = \min\{t_i;\ i = 0, \ldots, N : \phi(S_{t_i}) = V_{t_i}(S_{t_i})\}, \tag{1.463}$$


the early-exercise (stopping) domain as the union of line segments


$$D = \bigcup_{i=0,\ldots,N}\{(t_i, S) : \phi(S) = V_{t_i}(S)\}, \tag{1.464}$$

and the continuation domain

$$C = \bigcup_{i=0,\ldots,N-1}\{(t_i, S) : \phi(S) < \tilde{V}_{t_i}(S)\}. \tag{1.465}$$


Relation to Lattice (Tree) Methods 


The dynamic programming approach provides a basis for implementing a number of different 
numerical methods for computing option prices using either Monte Carlo simulations, quadra- 
ture rules of integration, lattice methods, or a combination of such methods. In particular, 
the dynamic programming formulation can be directly related to the simplest of the lattice 
models — the binomial and trinomial lattices. For a detailed exposition on the implementa- 
tions of lattice methods for pricing American options (as well as their European counterparts) 
the reader is urged to take a close look at the relevant numerical projects in Part II. The 
intricate details as well as the relevant equations and algorithms are explicitly described in 
those projects — the reader is also given the opportunity to numerically program the option- 
pricing applications. Here we shall simply give a very brief and generic discussion, meant 
only to emphasize the basic connection between the dynamic programming formulation and 
the lattice pricing models without having to repeat the underlying details. 

Lattice methods can be viewed as either: (i) approximate solutions to recurrence rela- 
tion (1.460) (or alternatively as approximate solutions to the equivalent option-pricing PDE by 
way of finite differences) or (ii) option-pricing models in their own right. Lattice models can 
accommodate time-inhomogeneous processes, as is the case for time-dependent drift and/or 
volatility functions. However, let’s assume time-homogeneous models, where the underlying 
asset or stock price process is essentially modeled as a Markov chain on a discrete set of 
possible states. Generally, one assumes that the stock price can only move on a set of nodes, 
each denoted by a pair of integers $(i, j)$ corresponding to a stock price value $S^i_j$. The lattice
is a mesh or grid made up of all such nodes, where the integer $j$ is an index for the spatial
position of the stock price on the lattice at time $t_i$, $i = 0, \ldots, N$. Lattice models allow for
the implementation of time steps of fixed or variable size, but for the sake of simplicity let’s
assume a fixed time step of size $\delta t = (T - t_0)/N$. In fact, most implementations are based on
equal-size time steps. Then conditional on $S_{t_i} = S^i_j$, the probability of a movement of the stock
price within a single time step $\delta t$ from a node $(i, j)$ into a successor node $(i+1, j')$, with value
$S_{t_{i+1}} = S^{i+1}_{j'}$, is given by the transition probability value
$P(S_{t_{i+1}} = S^{i+1}_{j'} \mid S_{t_i} = S^i_j) = p_{j \to j'} > 0$.
Although not critical to the present discussion, we note that for the binomial model there are
only two successor nodes with $j' = j, j+1$, whereas the trinomial model has three successor
nodes with $j' = j-1, j, j+1$, and so on.

The positive quantities $p_{j \to j'}$ are risk-adjusted probabilities and must obviously obey
probability conservation,


$$\sum_{j'} p_{j \to j'} = 1, \quad \text{for all } j, \tag{1.466}$$


where the sum is over all successor nodes in the model. Assuming the risk-neutral measure
with money market as numeraire, the expected rate of return of the stock must equal the
risk-free rate; i.e., $E_t[S_{t+\delta t}] = S_t e^{r\delta t}$. This is the risk-neutrality or no-arbitrage condition. For
the lattice model it takes the form


$$\sum_{j'} p_{j \to j'}\,S^{i+1}_{j'} = e^{\mu\,\delta t} S^i_j, \tag{1.467}$$


for all $(i, j)$ nodes, where $\mu = r$ or $\mu = r - q$ for nondividend- or dividend-paying stock.
In order to capture the variance in the asset price returns, the lattice model is also built to take
into account the asset price volatility. For instance, one can relate the variation either of stock
prices or of the log-returns that are computed separately using the diffusion model and the
lattice model. If the variation or second moment of the log-returns is considered, then we have
$E_t[(\delta \log S_t)^2] = (\sigma(S_t))^2\,\delta t$ within order $\delta t$, where $\sigma(S_t)$ is the local volatility function for the
general case of a state-dependent diffusion model of the form $\delta S_t = \mu(S_t)S_t\,\delta t + \sigma(S_t)S_t\,\delta W_t$.
Applying this same expectation at each node within the lattice model and equating the two
expectations gives


$$(\sigma^i_j)^2\,\delta t = \sum_{j'} p_{j \to j'}\left[\log\left(S^{i+1}_{j'}/S^i_j\right)\right]^2, \tag{1.468}$$


where $\sigma^i_j = \sigma(S^i_j)$ forms a set of volatility parameters. This is just one possible way of
introducing lattice volatility parameters into the model. Equations (1.466), (1.467), and (1.468) are
therefore collective constraints on the lattice geometry and the nodal transition probabilities.
These form an integral part of the construction of the lattice model and its parameters — 
this is part of the model calibration procedure. Further steps in the calibration can also be 
undertaken by fitting the lattice parameters so that certain computed option prices exactly 
match the corresponding market prices. In most applications the number of adjustable lattice 
parameters is greatly reduced. In particular, for geometric Brownian motion there is only one 
volatility parameter, i.e., $\sigma^i_j \to \sigma$. Moreover, most lattice models are simplified by assuming
that the nodal transitions are independent of the starting node, as is the case for constant
local volatilities, i.e., $p_{j \to j'} \to p_{j'}$. For specific details on the construction of lattices and on
implementing various calibration schemes for American and European option pricing within 
the binomial and trinomial models, we again refer the reader to the relevant projects in 
Part II. 
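
As a concrete illustration of constraints (1.466)-(1.468), the sketch below checks them for one common binomial parameterization, the Cox-Ross-Rubinstein choice $u = e^{\sigma\sqrt{\delta t}}$, $d = 1/u$. This particular parameterization and the sample numbers are assumptions made for the example; the text itself leaves the lattice geometry general.

```python
import math

def crr_parameters(sigma, mu, dt):
    """Cox-Ross-Rubinstein up/down factors and risk-neutral up-probability."""
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    p_up = (math.exp(mu * dt) - d) / (u - d)
    return u, d, p_up

# Illustrative inputs (assumptions): daily step, dividend-paying stock.
sigma, r, q, dt = 0.2, 0.05, 0.02, 1.0 / 252
mu = r - q
u, d, p_up = crr_parameters(sigma, mu, dt)

probs = [p_up, 1.0 - p_up]   # transition probabilities to the two successor nodes
factors = [u, d]             # ratios S^{i+1}_{j'} / S^i_j for those nodes

total_prob = sum(probs)                                                    # (1.466)
mean_growth = sum(p * f for p, f in zip(probs, factors))                   # (1.467), divided by S^i_j
second_moment = sum(p * math.log(f) ** 2 for p, f in zip(probs, factors))  # (1.468)
```

Here `total_prob` equals one, `mean_growth` reproduces $e^{\mu\,\delta t}$, and `second_moment` equals $\sigma^2\,\delta t$ because both log-returns have the same magnitude $\sigma\sqrt{\delta t}$.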

Once the lattice geometry and transition probabilities are determined, i.e., the lattice is
calibrated, the option prices at each node in the lattice, $V^i_j = V_{t_i}(S^i_j)$, can be determined by
recurrence:


$$V^i_j = \max\left\{\phi(S^i_j),\ e^{-r\delta t}\sum_{j'} p_{j \to j'}\,V^{i+1}_{j'}\right\}. \tag{1.469}$$


The current option price $V^0_0 = V_0(S_0)$ at spot $S^0_0 = S_0$ is obtained by simply iterating over
$N$ time steps, starting from the known payoff $V^N_j = \phi(S^N_j)$ at the terminal node values $S^N_j$.
Equation (1.469) also divides up the lattice into two groups of nodes: (i) a stopping domain
as the set $\{(i, j) : V^i_j = \phi(S^i_j)\}$ and (ii) a continuation domain as the set $\{(i, j) : V^i_j > \phi(S^i_j)\}$.
This second set gives the times $t_i$ and spot values $S^i_j$ for which the option should not be
exercised early. According to equation (1.463), the optimal stopping time is


$$t^* = \min\{t_i = i\,\delta t : V^i_j = \phi(S^i_j)\}. \tag{1.470}$$


The early-exercise boundary is then also readily obtained. For instance, for a call this is the set
of points $(i\,\delta t, S^i_*)$, $i = 0, \ldots, N$, where $S^i_* = \max\{S^i_j ;\ V^i_j > S^i_j - K\}$; for a put, $S^i_* = \min\{S^i_j ;\
V^i_j > K - S^i_j\}$. This offers a simple approach for approximating the early-exercise boundary
curve in the continuous diffusion model corresponding to the limit $\delta t \to 0$. However, the
resulting curve will not be smooth, even for relatively small time steps. More accurate 
calculations are afforded by applying more advanced techniques, such as the integral-equation 
approach discussed in Section 1.14.4. For the case of a trinomial lattice, equation (1.469) is 
related to the explicit finite-difference scheme for solving the Black-Scholes PDE. Alternative 
PDE solvers are based on implicit finite-difference schemes. Implicit schemes require the 
solution of a linear system of equations (or matrix inversion) for each time step in the 
propagation, yet they may offer more flexibility in the allowable range of lattice parameters 
for achieving accuracy and numerical stability. We refer the reader to the “Crank-Nicolson
Option Pricer” project in Part II, which discusses a special type of implementation of the
Crank-Nicolson implicit scheme for calibration and option pricing on a mesh. 
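
The backward recurrence (1.469) and the boundary rule for a put can be sketched on a binomial lattice as follows. This is an illustrative implementation under stated assumptions, not the scheme of the Part II projects: it uses the Cox-Ross-Rubinstein parameterization, indexes node $(i, j)$ by the number $j$ of up-moves, and the function name and sample inputs are hypothetical.

```python
import math

def american_put_binomial(S0, K, T, r, q, sigma, N):
    """American put by backward recurrence (1.469) on a CRR binomial lattice.

    Returns the root price and, for each step i < N, the smallest node price
    still in the continuation domain (the rule quoted in the text for a put).
    """
    dt = T / N
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    p = (math.exp((r - q) * dt) - d) / (u - d)   # risk-neutral up-probability
    disc = math.exp(-r * dt)

    # Terminal payoff V^N_j = phi(S^N_j).
    values = [max(K - S0 * u**j * d**(N - j), 0.0) for j in range(N + 1)]
    boundary = [None] * N

    for i in range(N - 1, -1, -1):
        new_values = []
        for j in range(i + 1):                   # j ascending means S ascending
            S = S0 * u**j * d**(i - j)
            continuation = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            payoff = max(K - S, 0.0)
            new_values.append(max(payoff, continuation))
            if continuation > payoff and boundary[i] is None:
                boundary[i] = S                  # first (lowest) continuation node
        values = new_values
    return values[0], boundary

# Illustrative inputs (assumptions): at-the-money put, no dividends.
price, boundary = american_put_binomial(100.0, 100.0, 1.0, 0.05, 0.0, 0.2, 200)
```

The list `boundary` jumps between lattice levels from one step to the next, which is the lack of smoothness noted above for lattice approximations of the early-exercise curve.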


The Smooth Pasting Condition and PDE Approach 


Although the free-boundary curve is not analytically computable as a function of time, one 
can generally establish the smooth pasting condition. This property guarantees that the price 
function for an American option has a continuous derivative at the exercise boundary and 
that the derivative is equal to the derivative of the payoff function at the exercise boundary. 
The following proposition summarizes this result. 


Proposition. Let $D_\tau$, with time to expiry $\tau = T - t > 0$, be the early-exercise domain for which
$V_t(S) = V(S, \tau) = \phi(S)$ when $S \in D_\tau$, where $\phi$ is any differentiable payoff function. Then
the American option price function $V$ satisfies the smooth pasting condition at the boundary
denoted by $S^*(\tau) = S^*$:

$$\left.\frac{\partial V(S, \tau)}{\partial S}\right|_{S = S^*(\tau)} = \phi'(S^*(\tau)), \tag{1.471}$$

and the zero-time-decay condition obtains on the early-exercise domain,

$$\frac{\partial V(S, \tau)}{\partial \tau} = 0, \quad \text{for } S \in D_\tau. \tag{1.472}$$


Remark: The condition in equation (1.471) is also obviously valid for $S \in D_\tau$ (excluding
the boundary) since $V(S, \tau) = \phi(S)$ on that domain. What is important to emphasize here is
that the derivative is continuous at the boundary of the stopping and continuation domains.
These properties are valid under general proper Itô diffusion models. For a call (or put),
then, equation (1.471) simply gives $\partial V/\partial S = 1$ (or $-1$). This is illustrated in Figure 1.6.

FIGURE 1.6 The pricing functions for an American put (left) and an American call (right) with
continuous dividend yield satisfy the smooth pasting condition with slope equal to $-1$ and $1$, respectively,
at the optimal exercise boundary $S^*(\tau)$ for given time to expiry $\tau > 0$.

Although this proposition can be formally proven from the PDE approach, we shall instead
demonstrate how it arises based on a dynamic hedging strategy argument, which turns out
to be financially more insightful. First we note that the graph of the American option value
is never below that of the payoff function. Moreover, for given calendar time $t$ (or time
to maturity $\tau$), the slope of the graph of $V_t(S) = V(S, \tau)$ at the exercise boundary point
$S = S^*(\tau) = S^*$ must be less (greater) than or equal to that of the payoff function if the
latter is an increasing (decreasing) function at the boundary. That is: (i) $\left.\frac{\partial V}{\partial S}\right|_{S = S^*_-} \le \phi'(S^*)$
for the case $\phi'(S^*) > 0$ or (ii) $\left.\frac{\partial V}{\partial S}\right|_{S = S^*_+} \ge \phi'(S^*)$ for the case $\phi'(S^*) < 0$. Here we use
$S^*_\pm$ to denote the limiting values from the right (+) or left (−) of $S^*$. Our objective is to
show that these inequalities in the slopes are actually strict equalities. We now show this 
for case (i) as the argument follows in identical fashion for case (ii). In particular, let us 
assume that the asset or stock price at calendar time $t$ is at the boundary; i.e., let $S_t = S^*$.
After an infinitesimally small time lapse $\delta t$, the stock price can move either up into the
exercise domain $D_\tau$ or down into the (no-exercise) domain of continuation. If the stock
price moves upward, then its change is $\delta S_t = S_{t+\delta t} - S^* > 0$, so $S_{t+\delta t} > S^*$ and it remains
in the exercise domain. In this case, $V_{t+\delta t}(S_{t+\delta t}) = \phi(S_{t+\delta t})$ and the option value changes
by an amount $\delta V_t = \phi(S_{t+\delta t}) - \phi(S^*) = \phi'(S^*)\,\delta S_t$, to leading order in $\delta t$. So to achieve
a delta hedge for an upward tick over time $\delta t$, the option writer has to buy $\Delta_t = \phi'(S^*)$
shares of the stock. The writer’s delta-hedge portfolio at time $t$ consists of one short position
in the option and $\Delta_t$ shares in the stock. Hence for an upward tick the hedge portfolio
has value $\pi_t = -V_t(S^*) + \Delta_t S^* = -V_t(S^*) + \phi'(S^*)S^*$, and the change in portfolio value is
$\delta\pi_t = -\delta V_t + \phi'(S^*)\,\delta S_t = 0$, to leading order in $\delta t$. On the other hand, if at time $t$ the
stock ticks down, then $\delta S_t < 0$, $S_{t+\delta t} < S^*$; hence the stock price falls into the domain of
continuation. Now assume the SDE in equation (1.381) holds. [Note: The same argument
also readily follows if we assume a more general Itô diffusion with state- and time-dependent
drift and volatility.] To leading order, then,

$$\delta S_t = \sigma S^*\,\delta W_t = -\sigma S^*\sqrt{\delta t}\,|z|, \tag{1.473}$$

where $z \sim N(0, 1)$, since $\delta W_t < 0$ for a downward tick. Now, $\delta V_t = \left.\frac{\partial V_t(S)}{\partial S}\right|_{S = S^*_-}\delta S_t$ and,
using the foregoing expression, the hedge portfolio changes by

$$\delta\pi_t = -\delta V_t + \phi'(S^*)\,\delta S_t = \left[\left.\frac{\partial V_t(S)}{\partial S}\right|_{S = S^*_-} - \phi'(S^*)\right]\sigma S^*\sqrt{\delta t}\,|z|. \tag{1.474}$$




Taking expectations and using $E[|z|] = \sqrt{2/\pi}$ gives the expected change in the hedge portfolio:

$$E[\delta\pi_t] = \sqrt{\frac{2}{\pi}}\left[\left.\frac{\partial V_t(S)}{\partial S}\right|_{S = S^*_-} - \phi'(S^*)\right]\sigma S^*\sqrt{\delta t}. \tag{1.475}$$

We hence conclude that the writer cannot exactly set up a delta hedge portfolio and in
particular is expected to incur a loss every time the underlying stock is in the vicinity of
the boundary unless $\left.\frac{\partial V_t(S)}{\partial S}\right|_{S = S^*_-} = \phi'(S^*)$. Since $\left.\frac{\partial V_t(S)}{\partial S}\right|_{S = S^*_+} = \phi'(S^*)$, the function $\frac{\partial V_t(S)}{\partial S}$ is
continuous at the boundary, and we have established equation (1.471).

The zero-time-decay condition is shown by simply considering the total change in the
American option value along the boundary $S = S^*(\tau)$ as the calendar time (or time to
maturity) changes and the boundary point moves accordingly. Along the boundary we have
$V(S^*(\tau), \tau) = \phi(S^*(\tau))$, and differentiating both sides of this relation w.r.t. $\tau$ gives (Note:
The analysis in terms of $t$ is the same):

$$\frac{\partial V(S^*(\tau), \tau)}{\partial S}\frac{dS^*(\tau)}{d\tau} + \frac{\partial V(S^*(\tau), \tau)}{\partial \tau} = \phi'(S^*(\tau))\frac{dS^*(\tau)}{d\tau}. \tag{1.476}$$

Hence, using equation (1.471) gives $\frac{\partial V(S^*(\tau), \tau)}{\partial \tau} = 0$, and since the option is given by the
time-independent payoff function everywhere else on the stopping domain, we have
equation (1.472).
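
The two numerical ingredients of the hedging argument, the value $E[|z|] = \sqrt{2/\pi}$ and the sign of the expected portfolio change in (1.475), can be checked with a short sketch. The midpoint-rule quadrature, the function names, and the sample slope values below are assumptions introduced only for illustration.

```python
import math

def expected_abs_normal(n=20_000, z_max=10.0):
    """Midpoint-rule estimate of E|z| for a standard normal z."""
    h = 2.0 * z_max / n
    total = 0.0
    for k in range(n):
        z = -z_max + (k + 0.5) * h
        total += abs(z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

def expected_hedge_change(slope_left, payoff_slope, sigma, s_star, dt):
    """Right-hand side of (1.475) for a given left-hand slope of V at S*."""
    return math.sqrt(2.0 / math.pi) * (slope_left - payoff_slope) * sigma * s_star * math.sqrt(dt)

e_abs = expected_abs_normal()

# Call-like payoff with slope 1: a left slope below 1 gives an expected loss,
# while a left slope equal to 1 (smooth pasting) gives no expected change.
loss = expected_hedge_change(0.9, 1.0, 0.2, 120.0, 1.0 / 252)
no_loss = expected_hedge_change(1.0, 1.0, 0.2, 120.0, 1.0 / 252)
```

The expected change scales like $\sqrt{\delta t}$ rather than $\delta t$, which is why a slope mismatch at the boundary cannot be absorbed by the usual order-$\delta t$ terms of the hedge.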

Delta hedging and continuous-time replication arguments apply to American options in 
the same way they apply to European options. Within the (no-exercise) continuation domain 
we therefore expect and require that the option price function satisfy the Black-Scholes PDE. 
The connection between the optimal stopping time formulation and the PDE approach can be 
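shown as follows. Consider recurrence relation (1.460) with time step $\delta t > 0$ for any calendar
time $t < T$,


$$V_t(S) = \max\left\{\phi(S),\ e^{-r\delta t}E_t\left[V_{t+\delta t}(S_{t+\delta t}) \mid S_t = S\right]\right\}. \tag{1.477}$$


Assuming $V_t(S)$ is sufficiently smooth with continuous derivatives, then to leading order
$O(\delta t)$ we can Taylor-expand $V_{t+\delta t}(S_{t+\delta t})$ while using Itô’s lemma. For a generally state- and
time-dependent process obeying $\delta S_t = \mu(S_t, t)\,\delta t + \sigma(S_t, t)\,\delta W_t$, we have


$$\begin{aligned}
V_t(S) &= \max\Big\{\phi(S),\ e^{-r\delta t}E_t\Big[V_t(S_t) + \Big(\frac{\partial V_t(S_t)}{\partial t} + \mu(S_t, t)\frac{\partial V_t(S_t)}{\partial S_t} + \frac{1}{2}\sigma^2(S_t, t)\frac{\partial^2 V_t(S_t)}{\partial S_t^2}\Big)\delta t \\
&\qquad\qquad + \sigma(S_t, t)\frac{\partial V_t(S_t)}{\partial S_t}\,\delta W_t \ \Big|\ S_t = S\Big]\Big\} + O((\delta t)^2) \\
&= \max\Big\{\phi(S),\ V_t(S) + \Big(\frac{\partial V_t(S)}{\partial t} + \mathcal{L}_{BS}V_t(S)\Big)\delta t\Big\} + O((\delta t)^2).
\end{aligned} \tag{1.478}$$


The second equation obtains by evaluating the conditional expectation (which sets $S_t = S$ and
eliminates the $\delta W_t$ term) and then collecting terms up to $O(\delta t)$. This expression has been
written more compactly using the Black-Scholes differential operator (for general drift and
volatility functions) defined by


$$\mathcal{L}_{BS}V = \frac{1}{2}\sigma^2(S, t)\frac{\partial^2 V}{\partial S^2} + \mu(S, t)\frac{\partial V}{\partial S} - rV = (\mathcal{L}_S - r)V. \tag{1.479}$$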
For values of $S$ in the continuation domain, the inequality $V_t(S) > \phi(S)$ is satisfied, and
hence, from equation (1.478) we must have the Black-Scholes PDE:


$$\frac{\partial V_t(S)}{\partial t} + \mathcal{L}_{BS}V_t(S) = 0, \quad \text{for all } S \notin D_\tau. \tag{1.480}$$

By specializing to the geometric Brownian motion model, then, $\mu(S, t) = (r - q)S$, $\sigma(S, t) =
\sigma S$ and the Black-Scholes PDE is


$$\frac{\partial V}{\partial \tau} = \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2} + (r - q)S\frac{\partial V}{\partial S} - rV \equiv \mathcal{L}_{BS}V, \quad \text{for all } S \notin D_\tau. \tag{1.481}$$


Thanks to the time-homogeneous property of the solution in this case, we have a PDE in
terms of the time-to-maturity variable, $V = V(S, \tau)$, which will be convenient in subsequent
discussions.


1.14.2 Perpetual American Options


An option with infinite time to maturity is called a perpetual option. Here we consider per- 
petual American calls and puts. These options are instructive since simple analytic solutions 
exist. Moreover, since the exercise boundary $S^*(\tau)$ is a monotonic function of time to maturity
$\tau$ (i.e., increasing for a dividend-paying American call and decreasing for an American
put), the perpetual option price provides us with the asymptotic limit $\lim_{\tau \to \infty} S^*(\tau) = S^*$ of
the exercise boundary for times infinitely far from maturity. We again consider an asset price 
process S, following geometric Brownian motion with constant interest rate r and continuous 
dividend yield at constant rate q. Since a perpetual option has infinite time to maturity, its 
value does not depend on the passage of time; i.e., the price function is independent of 
time. Hence the time derivative of the price function is zero and the Black-Scholes partial 
differential equation (1.481) for the price of a perpetual option reduces to a time-independent 
ordinary differential equation (ODE). 

We first consider the case of a perpetual put struck at K. The price function denoted by 
P(S) must satisfy the ODE 


$$\frac{1}{2}\sigma^2 S^2\frac{d^2P}{dS^2} + (r - q)S\frac{dP}{dS} - rP = 0 \tag{1.482}$$


for values away from the exercise boundary, $S^* < S < \infty$. The optimal exercise price $S^*$ is
therefore the asset price at which the perpetual American put should be exercised. Since the
value of the perpetual put must be equal to the intrinsic value at all values of $S \le S^*$ and
$S^* < K$ (see Figure 1.6), the boundary conditions on $P(S)$ are


$$\lim_{S \to \infty} P(S) = 0, \qquad P(S^*) = K - S^*. \tag{1.483}$$

$S^*$ is yet unknown but uniquely determined once $P(S)$ is obtained in terms of $S^*$, as described
next. Equation (1.482) is an ODE of the Cauchy-Euler (equidimensional) type and
therefore has the general solution


$$P(S) = a_+ S^{\gamma_+} + a_- S^{\gamma_-}, \tag{1.484}$$


where $a_\pm$ are arbitrary constants and $\gamma_\pm$ are roots of the auxiliary quadratic equation








$$\frac{1}{2}\sigma^2\gamma^2 + \left(r - q - \frac{1}{2}\sigma^2\right)\gamma - r = 0. \tag{1.485}$$

Solving for the roots gives

$$\gamma_\pm = \frac{-\left(r - q - \frac{1}{2}\sigma^2\right) \pm \sqrt{\left(r - q - \frac{1}{2}\sigma^2\right)^2 + 2\sigma^2 r}}{\sigma^2}. \tag{1.486}$$

Assuming positive interest rate $r$, then $\gamma_-$ and $\gamma_+$ are negative and positive roots, respectively.
To satisfy the first condition at infinity in equation (1.483) we must have $a_+ = 0$. By satisfying
the second boundary condition in equation (1.483), $a_- = (K - S^*)/(S^*)^{\gamma_-}$, we obtain the price
function in the form


$$P(S) = (K - S^*)\left(\frac{S}{S^*}\right)^{\gamma_-}, \quad S \ge S^*. \tag{1.487}$$


The exercise boundary value S* can now be determined as the optimal value that maximizes 
the price P(S) for all possible choices of S*. The derivative w.r.t. the parameter S* of this 


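price function gives

$$\frac{\partial P}{\partial S^*} = -\left(\frac{S}{S^*}\right)^{\gamma_-}\left(1 + \gamma_-\frac{K - S^*}{S^*}\right). \tag{1.488}$$


Setting this derivative to zero yields the extremum

$$S^* = \frac{K\gamma_-}{\gamma_- - 1}. \tag{1.489}$$

Computing the second derivative at this extremum gives $\frac{\partial^2 P}{\partial S^{*2}} = \frac{K\gamma_-}{(S^*)^2}\left(\frac{S}{S^*}\right)^{\gamma_-} < 0$. Hence $S^*$ in
equation (1.489) is a maximum, and inserting its value into equation (1.487) gives the price
of the perpetual American put in the equivalent forms


$$P(S) = \frac{K}{1 - \gamma_-}\left(\frac{\gamma_- - 1}{\gamma_-}\right)^{\gamma_-}\left(\frac{S}{K}\right)^{\gamma_-} = -\frac{S^*}{\gamma_-}\left(\frac{S}{S^*}\right)^{\gamma_-} \tag{1.490}$$

for $S \ge S^*$. This solution is easily shown to satisfy the required smooth pasting condition

$$\left.\frac{dP}{dS}\right|_{S = S^*} = -1. \tag{1.491}$$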
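
The perpetual put formulas can be evaluated directly. The sketch below computes the roots (1.486), the boundary (1.489), and the price (1.487), then checks the smooth pasting slope (1.491) by a one-sided finite difference; the function names, the finite-difference step, and the sample parameters are assumptions made for illustration.

```python
import math

def gamma_roots(r, q, sigma):
    """Roots (gamma_minus, gamma_plus) of the quadratic (1.485), via (1.486)."""
    b = r - q - 0.5 * sigma**2
    root = math.sqrt(b * b + 2.0 * sigma**2 * r)
    return (-b - root) / sigma**2, (-b + root) / sigma**2

def perpetual_put(S, K, r, q, sigma):
    """Return (price, S*) for the perpetual American put, using (1.487) and (1.489)."""
    g_minus, _ = gamma_roots(r, q, sigma)
    s_star = K * g_minus / (g_minus - 1.0)
    if S <= s_star:
        return K - S, s_star                       # stopping domain: intrinsic value
    return (K - s_star) * (S / s_star) ** g_minus, s_star

# Illustrative inputs (assumptions): r = 5%, q = 0, sigma = 20%.
K, r, q, sigma = 100.0, 0.05, 0.0, 0.2
g_minus, g_plus = gamma_roots(r, q, sigma)
price_atm, s_star = perpetual_put(K, K, r, q, sigma)

# One-sided slope just above the boundary; smooth pasting predicts -1.
h = 1e-5
slope = (perpetual_put(s_star + h, K, r, q, sigma)[0] - (K - s_star)) / h
```

For these inputs the roots are $\gamma_- = -2.5$ and $\gamma_+ = 1$, so the boundary is $S^* = 100 \cdot 2.5/3.5 \approx 71.43$, well below the strike as required.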





Next we consider the perpetual American call struck at K. As in the case of the put, the 
price function now denoted by C(S) also satisfies equation (1.482), but for values 0 < S < S*. 
The optimal value S* is therefore the asset price at which the call should be exercised. The 
value C(S) must be given by the intrinsic value of the call pay-off for values on the boundary 
S > S*, where S* > K; hence the boundary conditions are 


$$\lim_{S \to 0} C(S) = 0, \qquad C(S^*) = S^* - K. \tag{1.492}$$

The general solution is again given by equations (1.484) and (1.486). However, by satisfying
the boundary conditions in equation (1.492) we now instead have $a_- = 0$ and $a_+ =
(S^* - K)/(S^*)^{\gamma_+}$, giving


$$C(S) = (S^* - K)\left(\frac{S}{S^*}\right)^{\gamma_+}, \quad 0 < S \le S^*. \tag{1.493}$$

Using the same procedure as for the put, the optimal exercise boundary is determined by
finding the maximum of $C(S)$ w.r.t. $S^*$, giving

$$S^* = \frac{K\gamma_+}{\gamma_+ - 1}. \tag{1.494}$$

Using $S^*$ from equation (1.494) in equation (1.493) gives the price of the perpetual American
call, written equivalently in terms of $K$ or $S^*$:

$$C(S) = \frac{K}{\gamma_+ - 1}\left(\frac{\gamma_+ - 1}{\gamma_+}\right)^{\gamma_+}\left(\frac{S}{K}\right)^{\gamma_+} = \frac{S^*}{\gamma_+}\left(\frac{S}{S^*}\right)^{\gamma_+}. \tag{1.495}$$

This satisfies the required smooth pasting condition

$$\left.\frac{dC}{dS}\right|_{S = S^*} = 1. \tag{1.496}$$

It is instructive to examine what happens to the exercise boundary in the two separate
limiting cases: (i) zero interest rate $r = 0$ and (ii) zero dividend yield $q = 0$. In case (i)
we have from equation (1.486) that $\gamma_- = 0$ (assuming $q > -\sigma^2/2$, which is the case if
$q \ge 0$). From equation (1.489) we see that $S^* = 0$; hence, for zero interest rate the perpetual
put is never exercised early. This is consistent with the property of an American put for
$r = 0$ and for any finite time to maturity, as shown in the next section. From a financial
standpoint, there is no time value gained from an early pay-off with zero interest. For case (ii):
Equation (1.486) gives $\gamma_+ = 1$ (assuming $r > -\sigma^2/2$, which is the case for $r \ge 0$). Moreover,
$\gamma_+ \to 1^+$ as $q \to 0^+$ and from equation (1.494) we have $S^* \to \infty$. Hence in the limit of zero
dividend yield the perpetual call is never exercised early, irrespective of the interest rate.
This feature is also consistent with the plain American call of finite maturity, as shown in
the next section.


1.14.3 Properties of the Early-Exercise Boundary


The perpetual American option formulas of the previous section already allowed us to 
determine the precise behavior of the optimal exercise boundary in the asymptotic limit of 
infinite time to expiry, i.e., as $\tau \to \infty$. To further complete the analysis of the boundary we
now consider the opposite limit, of infinitesimally small positive time to maturity $\tau \to 0^+$.
In particular, let us consider the case of the American call struck at $K$ with continuous dividend
yield $q$ and price function denoted by $C(S, K, \tau)$ at spot $S$. Since $C(S, K, \tau)$ is an increasing
function of $\tau$, for $\tau > 0$, the graph of the American call price (plotted as a function of $S$) with
greater time to maturity $\tau_2$ must lie above the graph of the price function for the corresponding
call with time to maturity $\tau_1 < \tau_2$. Furthermore, the smooth pasting condition guarantees that
the price functions join the intrinsic line at levels $S^*(\tau_2) - K$ and $S^*(\tau_1) - K$, respectively,
giving $S^*(\tau_1) < S^*(\tau_2)$. Hence, we conclude that $S^*(\tau)$ is a continuously increasing function
of positive $\tau$. To put this in financial terms, an American call with greater time to maturity
should be exercised deeper in the money to account for the loss of time value on the strike
$K$. Due to the fact that one would never prematurely exercise at a spot value below the strike
level (i.e., exercising for a nonpositive pay-off), the early-exercise boundary for an American
call must, in addition, satisfy the property $S^*(\tau) \ge K$ for all $\tau > 0$.

To determine the boundary in the limit $\tau \to 0^+$, note that the option value approaches the
intrinsic value; i.e., at expiry it is exactly given by the payoff function $C(S, K, \tau = 0) = S - K$
for values on the exercise boundary. Inserting this function into the right-hand side of
equation (1.481) and taking derivatives gives


$$\frac{\partial C(S, K, 0^+)}{\partial \tau} = rK - qS \tag{1.497}$$


for $S > K$. Since the condition $\partial C(S, K, 0^+)/\partial\tau > 0$ ensures that the option is still alive (i.e.,
not yet exercised), the spot value $S$ at which $\partial C(S, K, 0^+)/\partial\tau$ becomes negative and hence
for which the call is exercised at an instant just before expiry is given by $S = \frac{r}{q}K$. This is
the case, however, if the value $\frac{r}{q}K$ is in the interval $S > K$, that is, if $r > q > 0$. In this
case, just prior to expiry the call is not yet exercised if the spot is in the region $K < S < \frac{r}{q}K$
but would be exercised if $S \ge \frac{r}{q}K$. Hence, $S^*(0^+) = \frac{r}{q}K$ for $r > q > 0$. In the other case,
$r \le q$, so $\frac{r}{q}K \le K$. Yet $S > K$, so $S^*(0^+) = K$ for $r \le q$. Note that the condition $S^*(0^+) > K$
is not possible in this case because this leads to a suboptimal early exercise, since the loss
in dividends would have greater value than the interest earned over the infinitesimal time
interval until expiry. Combining these arguments we arrive at the general limiting condition
for the exercise boundary of an American call just prior to expiry:


$$\lim_{\tau \to 0^+} S^*(\tau) = \max\left(K, \frac{r}{q}K\right). \tag{1.498}$$


From this property we see that $S^*(0^+) \to \infty$ as $q \to 0$. Hence, for zero dividend yield
the American call is never exercised early, which is consistent with the fact that the plain
(nondividend) American call has exactly the same worth as the plain European call.

Similar arguments can also be employed in the case of the American put struck at $K$ with
continuous dividend yield $q$. At expiry the put has value $P(S, K, \tau = 0) = K - S$ for values
on the exercise boundary. We leave it as an exercise for the reader to show that the exercise
boundary of an American put just prior to expiry is given by

$$\lim_{\tau \to 0^+} S^*(\tau) = \min\left(K, \frac{r}{q}K\right). \tag{1.499}$$

For $r = 0$ we therefore have $S^*(0^+) = 0$, irrespective of the value of $q$. Since $S^*(\tau)$ is a
decreasing function of $\tau$, we conclude that the early-exercise boundary is always at zero,
meaning that the American put with zero interest rate is never exercised before maturity. This
is consistent with the conclusion we arrived at earlier, where we considered the perpetual
American put. For $q \le r$ we observe that the early-exercise boundary just before expiry is at
the strike, $S^*(0^+) = K$. A special case of this is the vanilla American put, i.e., when $r > 0$
and $q = 0$. Figure 1.7 gives an illustration of typical early-exercise boundaries for a call and
put. Given a time to maturity of $T$ at contract inception, we see that the American call with
nonzero dividend is not yet exercised (i.e., is still alive) on the domain of points $(S, \tau)$ below
the exercise curve: $S \in [0, S^*(\tau))$ and $\tau \in (0, T]$. In contrast, the American put is kept alive
above the exercise curve: $S \in (S^*(\tau), \infty)$ and $\tau \in (0, T]$.
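
The two ends of the early-exercise curve, the near-expiry limits (1.498) and (1.499) and the perpetual limits (1.494) and (1.489), can be tabulated with a short sketch. The function names, the explicit handling of the $q = 0$ and $r = 0$ cases, and the sample parameters are assumptions made for illustration.

```python
import math

def near_expiry_boundary(K, r, q, kind):
    """Limit of S*(tau) as tau -> 0+: (1.498) for a call, (1.499) for a put."""
    if kind == "call":
        if q == 0.0:
            return math.inf          # S*(0+) -> infinity as q -> 0: never exercised early
        return max(K, r * K / q)
    if r == 0.0:
        return 0.0                   # put with zero interest rate: never exercised early
    if q == 0.0:
        return K                     # min(K, +infinity)
    return min(K, r * K / q)

def perpetual_boundary(K, r, q, sigma, kind):
    """Limit of S*(tau) as tau -> infinity: (1.494) for a call, (1.489) for a put.

    For a call this sketch requires q > 0, since gamma_plus = 1 when q = 0.
    """
    b = r - q - 0.5 * sigma**2
    root = math.sqrt(b * b + 2.0 * sigma**2 * r)
    gamma = (-b + root) / sigma**2 if kind == "call" else (-b - root) / sigma**2
    return K * gamma / (gamma - 1.0)

# Illustrative inputs (assumptions): r > q > 0.
K, r, q, sigma = 100.0, 0.05, 0.03, 0.2
call_start = near_expiry_boundary(K, r, q, "call")
call_limit = perpetual_boundary(K, r, q, sigma, "call")
put_start = near_expiry_boundary(K, r, q, "put")
put_limit = perpetual_boundary(K, r, q, sigma, "put")
```

For these inputs the call boundary starts at $\frac{r}{q}K$ and rises toward its perpetual level, while the put boundary starts at the strike and falls toward its perpetual level, in line with the monotonicity properties described in this section.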


1.14.4 The Partial Differential Equation and Integral Equation Formulation 


The problem of pricing an American option can be formulated as an initial-value partial 
differential equation (PDE) with a time-dependent free boundary. The early-exercise boundary 
is an unknown function of time, which must also be determined as part of the solution. 

FIGURE 1.7 Early-exercise smooth boundary curves $S = S^*(\tau)$ for the American call (left), with $q > 0$,
and put (right), with values depicted just before expiry $\tau \to 0^+$. In the limit of infinite time to expiry,
the curves approach the horizontal asymptotes at $S = S^*$, where $S^*$ is given by equation (1.494) or
equation (1.489) for the call or put, respectively.

In particular, let $V(S, \tau)$ represent the pricing function of an American option with spot $S$
and time to maturity $\tau$, $0 \le \tau \le T$, and having payoff or intrinsic function $V(S, 0) = \phi(S)$.
Here we assume the pay-off is time independent, although the formulation also extends to
the case of a known time-dependent payoff function. For given $\tau$, the solution domain is
divisible into a union of two regions: (1) a continuation region $(S, \tau) \in D^c_\tau \times [0, T]$, for
which the option is still alive or not exercised, and (2) a stopping region $(S, \tau) \in D_\tau \times [0, T]$,
where $D_\tau$ is the complement of $D^c_\tau$ within $\mathbb{R}_+$, for which the American option is already
exercised. The domains depend on $\tau$. As seen in the previous section, in the case of the
American call, $\phi(S) = S - K$ on $D_\tau = [S^*(\tau), \infty)$ (and $D^c_\tau = (0, S^*(\tau))$), while for the put,
$\phi(S) = K - S$ on $D_\tau = (0, S^*(\tau)]$ (and $D^c_\tau = (S^*(\tau), \infty)$). Assuming the underlying asset
follows equation (1.381), equation (1.481) holds for $S \in D^c_\tau$. In contrast, the homogeneous
Black-Scholes PDE does not hold on the domain of the early-exercise boundary, where the
American option is given by the time-independent payoff function $V(S, \tau) = \phi(S)$. Since
$\frac{\partial \phi(S)}{\partial \tau} = 0$, the solution on $D_\tau$ satisfies $\frac{\partial V}{\partial \tau} = 0$. Combining regions and assuming the pay-off
is twice differentiable gives a nonhomogeneous Black-Scholes PDE:





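$$\frac{\partial V(S,\tau)}{\partial \tau} = \mathcal{L}_{BS} V(S,\tau) + f(S,\tau), \tag{1.500}$$

with (source) function

$$f(S,\tau) = \begin{cases} 0, & S \in \mathcal{D}^c_\tau, \\ -\mathcal{L}_{BS}\,\phi(S), & S \in \mathcal{D}_\tau, \end{cases} \tag{1.501}$$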
where $\mathcal{L}_{BS}$ is the Black-Scholes differential operator. For geometric Brownian motion, $\mathcal{L}_{BS}$ is
defined by equation (1.481). Given the function $f(S, \tau)$, whose time dependence is determined
in terms of the free boundary, the solution to equation (1.500), subject to the initial condition
$V(S, \tau = 0) = \phi(S)$ and boundary conditions $V(S = 0, \tau) = \phi(0)$, $V(S = \infty, \tau) = \phi(\infty)$, can
be obtained in terms of the solution to the corresponding homogeneous Black-Scholes PDE.
Recall from previous discussions that the transition probability density function $p(S', S; \tau)$
solves the forward Kolmogorov PDE in the $S'$ variable and the backward PDE in the spot
variable $S$ with zero boundary conditions at $S = 0, \infty$ for all $\tau > 0$. As already mentioned,
for process (1.381) $p$ is just the lognormal density given by equation (1.382). We also know
that $e^{-r\tau} p$ solves the homogeneous Black-Scholes PDE. Combining these facts and applying
Laplace transforms, one arrives at the well-known Duhamel's solution to equation (1.500) in
the form


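$$V(S,\tau) = e^{-r\tau}\int_0^\infty p(S', S; \tau)\,\phi(S')\,dS' + \int_0^\tau e^{-r\tau'}\left[\int_0^\infty p(S', S; \tau')\,f(S', \tau - \tau')\,dS'\right]d\tau' = V_E(S,\tau) + V^e(S,\tau). \tag{1.502}$$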


One can readily verify that this solves equation (1.500), even for the more general case
of state-dependent models (see Problem 1). An important aspect of this result is that the
American option value $V(S, \tau)$ is expressible as a sum of two components. The first term is
simply the European option value $V_E$, as given by the discounted risk-neutral expectation of
the pay-off. Hence the second term, denoted by $V^e(S, \tau)$, must represent the early-exercise
premium, which gives the holder the additional liberty of early exercise.

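Assuming geometric Brownian motion for the underlying asset, equations (1.500)
and (1.501) for the American call and put specialize to

$$\frac{\partial C}{\partial \tau} - \frac{\sigma^2 S^2}{2}\frac{\partial^2 C}{\partial S^2} - (r - q)S\frac{\partial C}{\partial S} + rC = \begin{cases} 0, & S < S^*(\tau), \\ qS - rK, & S \ge S^*(\tau), \end{cases} \tag{1.503}$$

and

$$\frac{\partial P}{\partial \tau} - \frac{\sigma^2 S^2}{2}\frac{\partial^2 P}{\partial S^2} - (r - q)S\frac{\partial P}{\partial S} + rP = \begin{cases} rK - qS, & S \le S^*(\tau), \\ 0, & S > S^*(\tau), \end{cases} \tag{1.504}$$

respectively. Here we used $\mathcal{L}_{BS}(S - K) = rK - qS$, and $S^*(\tau)$ denotes the early-exercise bound-
ary for the respective call and put with strike $K$. The right-hand sides of these nonhomogeneous
PDEs are nonzero only within the respective stopping regions. Using equation (1.502), the
solutions to equations (1.503) and (1.504) for the American call and put price are given by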
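$$C(S, K, \tau) = C_E(S, K, \tau) + C^e(S, K, \tau) \tag{1.505}$$

and

$$P(S, K, \tau) = P_E(S, K, \tau) + P^e(S, K, \tau), \tag{1.506}$$

where the respective early-exercise premiums take on the integral forms

$$C^e(S, K, \tau) = \int_0^\tau e^{-r\tau'}\left[\int_{S^*(\tau-\tau')}^\infty p(S', S; \tau')\,(qS' - rK)\,dS'\right]d\tau' \tag{1.507}$$

and

$$P^e(S, K, \tau) = \int_0^\tau e^{-r\tau'}\left[\int_0^{S^*(\tau-\tau')} p(S', S; \tau')\,(rK - qS')\,dS'\right]d\tau'. \tag{1.508}$$

These premiums can also be recast as

$$C^e(S, K, \tau) = \int_0^\tau e^{-r\tau'}\,E_0\!\left[(qS_{\tau'} - rK)\,\mathbf{1}_{\{S_{\tau'} \ge S^*(\tau-\tau')\}}\right]d\tau' \tag{1.509}$$

and

$$P^e(S, K, \tau) = \int_0^\tau e^{-r\tau'}\,E_0\!\left[(rK - qS_{\tau'})\,\mathbf{1}_{\{S_{\tau'} \le S^*(\tau-\tau')\}}\right]d\tau', \tag{1.510}$$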

where $E_0$ denotes the current-time expectation, conditional on asset paths starting at $S_0 = S$
under the risk-neutral measure with density $p(S_{\tau'}, S; \tau')$. The time integral is over all
intermediate times to maturity, and the indicator functions ensure that all asset paths fall
within the early-exercise region. The properties of the early-exercise boundaries established
in the previous section guarantee that the early-exercise premiums are nonnegative. For a
dividend-paying call, equation (1.498), together with the indicator function condition, leads to
$S_{\tau'} \ge \max(\tfrac{r}{q}K, K) \ge \tfrac{r}{q}K$; hence $qS_{\tau'} - rK \ge 0$ and $C^e$ is positive. A similar analysis follows
for the put premium. The exercise premiums hence involve a continuous stream of discounted
expected cash flows, beginning from contract inception until maturity. This lends itself to
an interesting financial interpretation, as follows. Consider the case of the American put
(a similar argument applies to the dividend-paying call) and an infinitesimal intermediate
time interval $[\tau', \tau' + d\tau']$. Then from the holder's perspective the option should be optimally
exercised if the asset price, given by $S_{\tau'}$ at time $\tau'$, attains the stopping region (i.e., reaches the
early-exercise boundary with $S_{\tau'} \le S^*(\tau - \tau')$ and $\tau - \tau'$ as the remaining time to maturity).
Assuming that the holder is instead forced to keep the American put alive until expiry, the
holder would have to be fairly compensated for the loss due to the delay in exercising during
the time interval $d\tau'$. The value of this compensation is the difference between the interest
on $K$ dollars and the dividend earned on the asset value $S_{\tau'}$, continuously compounded over
time $d\tau'$. This cash flow is an amount $(rK - qS_{\tau'})\,d\tau'$, and corresponds to the early-exercise
gain if the holder in fact had the privilege to optimally exercise. Allowing for all possible
asset price scenarios from $S$ to $S_{\tau'}$ that attain the boundary gives rise to the expectation
integral under the risk-neutral density for all intermediate times $0 \le \tau' \le \tau$. Summing up
all of these infinitesimal cash flows and discounting their values to present time by an
amount $e^{-r\tau'}$ gives the time integral, as in equation (1.508) or (1.510). We conclude that the
early-exercise premium has an equivalent and alternative interpretation as a delay-exercise
compensation.

The foregoing integral representations for the American call and put price can also be
applied to cases where the volatility of the asset price process $S_t$ is considered generally state
dependent. In order to implement the integral formulas, we need to be able to compute the
transition density function $p$, either analytically or numerically. Moreover, the integrals can
only be computed after having determined the early-exercise boundary $S^*(\tau')$ for $0 \le \tau' \le \tau$.
For the geometric Brownian motion model (with constants $r$, $q$, $\sigma$), $p$ is given by the lognormal
density, and the foregoing double integrals readily simplify to single time integrals in terms
of standard cumulative normal functions. In particular, one readily derives explicit integral
representations for the price of the American call and put (see Problem 2):


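$$C(S, K, \tau) = S e^{-q\tau} N(d_+) - K e^{-r\tau} N(d_-) + \int_0^\tau \left[ qS e^{-q(\tau-\tau')} N(d^*_+(\tau')) - rK e^{-r(\tau-\tau')} N(d^*_-(\tau')) \right] d\tau', \tag{1.511}$$

$$P(S, K, \tau) = K e^{-r\tau} N(-d_-) - S e^{-q\tau} N(-d_+) + \int_0^\tau \left[ rK e^{-r(\tau-\tau')} N(-d^*_-(\tau')) - qS e^{-q(\tau-\tau')} N(-d^*_+(\tau')) \right] d\tau', \tag{1.512}$$

where

$$d_\pm = \frac{\log\frac{S}{K} + (r - q \pm \frac{1}{2}\sigma^2)\,\tau}{\sigma\sqrt{\tau}}, \tag{1.513}$$

$$d^*_\pm(\tau') = \frac{\log\frac{S}{S^*(\tau')} + (r - q \pm \frac{1}{2}\sigma^2)(\tau - \tau')}{\sigma\sqrt{\tau - \tau'}}. \tag{1.514}$$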





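These integral representations are valid for $S \in (0, \infty)$, $\tau > 0$. By setting $S = S^*(\tau)$ and
applying the respective boundary conditions, $C(S^*(\tau), K, \tau) = S^*(\tau) - K$ for the call and
$P(S^*(\tau), K, \tau) = K - S^*(\tau)$ for the put, equations (1.511) and (1.512) give rise to integral
equations for the early-exercise boundary. For the call,

$$S^*(\tau) - K = S^*(\tau) e^{-q\tau} N(\hat d_+) - K e^{-r\tau} N(\hat d_-) + \int_0^\tau \left[ qS^*(\tau) e^{-q(\tau-\tau')} N(\hat d^*_+(\tau')) - rK e^{-r(\tau-\tau')} N(\hat d^*_-(\tau')) \right] d\tau', \tag{1.515}$$

and separately for the put,

$$K - S^*(\tau) = K e^{-r\tau} N(-\hat d_-) - S^*(\tau) e^{-q\tau} N(-\hat d_+) + \int_0^\tau \left[ rK e^{-r(\tau-\tau')} N(-\hat d^*_-(\tau')) - qS^*(\tau) e^{-q(\tau-\tau')} N(-\hat d^*_+(\tau')) \right] d\tau', \tag{1.516}$$

where

$$\hat d_\pm = \frac{\log\frac{S^*(\tau)}{K} + (r - q \pm \frac{1}{2}\sigma^2)\,\tau}{\sigma\sqrt{\tau}}, \tag{1.517}$$

$$\hat d^*_\pm(\tau') = \frac{\log\frac{S^*(\tau)}{S^*(\tau')} + (r - q \pm \frac{1}{2}\sigma^2)(\tau - \tau')}{\sigma\sqrt{\tau - \tau'}}. \tag{1.518}$$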


Note that equations (1.515) and (1.516) involve a variable upper integration limit and the
integrands are nonlinear functions of $S^*(\tau)$, $S^*(\tau')$, $\tau$ and $\tau'$. From the theory of integral
equations, equations (1.515) and (1.516) are known as nonlinear Volterra integral equations.
Note that the solution $S^*(\tau)$, at time to maturity $\tau$, is dependent on the solution $S^*(\tau')$ from
zero time to maturity $\tau' = 0$ up to $\tau' = \tau$. Although equations (1.515) and (1.516) are not
analytically tractable, simple and efficient algorithms can be employed to solve for $S^*(\tau)$
numerically. For detailed descriptions of various numerical algorithms for solving these types
of integral equations, see, for example, [DM88]. A typical procedure divides the solution
domain into a regular mesh: $\tau_0 = 0$, $\tau_i = ih$, $i = 1, \ldots, n$, with $n$ steps spaced as $h = T/n$.
By approximating the time integral via a quadrature rule (e.g., the trapezoidal rule), one
obtains a system of algebraic equations in the values $S^*(\tau_i)$, which can be iteratively solved
starting from the known value $S^*(\tau_0) = S^*(\tau = 0^+)$ at zero time to maturity. Alternatively,
popular Runge-Kutta methods usually used for solving initial-value nonlinear ODEs can
also be adapted to these integral equations. Once the early-exercise boundary is determined, the
integral in equation (1.511) or (1.512) for the respective call or put can be computed. In
particular, a quadrature rule that makes use of the computed points $S^*(\tau_i)$ can be implemented.
Accurate approximations to the early-exercise boundary are obtained by choosing the number
$n$ of points to be sufficiently large.
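
The mesh procedure just described can be sketched in a few lines of Python for the American put. This is a minimal illustration and not the algorithm of [DM88]: it assumes the trapezoidal rule, plain bisection for each algebraic equation, and the starting value $S^*(0^+) = K\min(1, r/q)$ (taken as $K$ when $q = 0$). The function names are hypothetical, and the integrand at zero remaining time is replaced by its limit.

```python
import math

def norm_cdf(x):
    """Standard cumulative normal function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d_pm(ratio, s, r, q, sigma, sign):
    """d_+ (sign=+1) or d_- (sign=-1) for a price ratio over elapsed time s > 0."""
    return (math.log(ratio) + (r - q + sign * 0.5 * sigma ** 2) * s) / (sigma * math.sqrt(s))

def european_put(S, K, tau, r, q, sigma):
    """European part of equation (1.512)."""
    dp = d_pm(S / K, tau, r, q, sigma, +1)
    dm = d_pm(S / K, tau, r, q, sigma, -1)
    return K * math.exp(-r * tau) * norm_cdf(-dm) - S * math.exp(-q * tau) * norm_cdf(-dp)

def premium_integrand(S, B, s, K, r, q, sigma):
    """Integrand of (1.512) with B = S*(tau') and s = tau - tau'."""
    if s <= 0.0:
        # limit s -> 0+: N(-d) tends to 1, 1/2 or 0 for S below, at or above B
        w = 1.0 if S < B else (0.5 if S == B else 0.0)
        return w * (r * K - q * S)
    dp = d_pm(S / B, s, r, q, sigma, +1)
    dm = d_pm(S / B, s, r, q, sigma, -1)
    return (r * K * math.exp(-r * s) * norm_cdf(-dm)
            - q * S * math.exp(-q * s) * norm_cdf(-dp))

def american_put(S, boundary, h, K, r, q, sigma):
    """Equation (1.512) by the trapezoidal rule; boundary[j] approximates S*(j*h)."""
    i = len(boundary) - 1
    tau = i * h
    vals = [premium_integrand(S, boundary[j], tau - j * h, K, r, q, sigma)
            for j in range(i + 1)]
    integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    return european_put(S, K, tau, r, q, sigma) + integral

def solve_put_boundary(K, r, q, sigma, T, n):
    """March through tau_i = i*h, solving the discretized equation (1.516) at each step."""
    h = T / n
    b0 = K if q <= 0.0 else K * min(1.0, r / q)  # assumed value of S*(0+) for the put
    boundary = [b0]
    for i in range(1, n + 1):
        def residual(B):
            # intrinsic value minus the integral representation evaluated at S = B
            return (K - B) - american_put(B, boundary + [B], h, K, r, q, sigma)
        lo, hi = 1e-8 * K, boundary[-1]
        for _ in range(60):  # bisection: residual > 0 below the root, < 0 above
            mid = 0.5 * (lo + hi)
            if residual(mid) > 0.0:
                lo = mid
            else:
                hi = mid
        boundary.append(0.5 * (lo + hi))
    return boundary, h
```

For example, `boundary, h = solve_put_boundary(100.0, 0.05, 0.0, 0.2, 1.0, 50)` followed by `american_put(100.0, boundary, h, 100.0, 0.05, 0.0, 0.2)` gives an approximate at-the-money put price that includes the early-exercise premium. Increasing `n` refines both the boundary and the price, at a cost that grows quadratically in `n`.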
Problems


Problem 1. Consider the state-dependent model $dS_t = \mu(S_t)\,dt + \sigma(S_t)\,dW_t$. Assuming $f(S, \tau)$
is differentiable w.r.t. $\tau$, show that equation (1.502) satisfies equation (1.500) for the appropriate
operator $\mathcal{L}_{BS}$. Hint: Since $V_E$ satisfies the homogeneous Black-Scholes PDE, from
superposition one need only show that $V^e$ satisfies equation (1.500). Use the property of
interchanging order of differentiation and integration, integration by parts, and the fact that $e^{-r\tau} p$
satisfies the homogeneous Black-Scholes PDE with initial condition $p(S', S; 0) = \delta(S' - S)$.
Provide an extension to equation (1.502), if possible, for the more general case of explicitly
time-dependent drift and volatility.


Problem 2. (a) By employing similar manipulations as were used to obtain the standard
Black-Scholes formulas in Section 1.6, derive equations (1.511) and (1.512) from
equations (1.507) and (1.508). (b) Show that the pricing formulas for the American call and put in
equations (1.511) and (1.512) satisfy the required boundary conditions at $S = 0$ and $S = \infty$.


Problem 3. Find an analytical formula for the price as well as the early-exercise boundaries
of a perpetual American butterfly option with payoff function $\delta_\epsilon(S - K)$ given by
equation (1.228) of Section 1.8. Assume $K - \epsilon > 0$ and that the underlying asset price obeys
geometric Brownian motion with constant interest rate $r$ and continuous dividend yield $q$.


Problem 4. Using equations (1.511) and (1.512), derive integral representations for the delta, 
gamma, and vega sensitivities of the American call and put. 


Problem 5. Let $V(S, \tau)$ and $V_E(S, \tau)$ denote the American and European option values,
respectively, with spot $S$, time to maturity $\tau$, and payoff function $\phi(S)$. Assume a constant
interest rate $r$ and continuous dividend yield $q$ under the geometric Brownian motion model
for the process $S_t$. Prove the equivalence of these two statements:


(i) $V(S, \tau) > V_E(S, \tau)$ for all $S > 0$, $\tau > 0$.
(ii) $\phi(S) > e^{-r\tau}\phi(e^{(r-q)\tau}S)$ for some point $(S, \tau)$.

Explain why American options on futures have a nonzero early-exercise premium.


Problem 6. Consider a Bermudan put option with strike $K$ at maturity $T$ with only a single
intermediate early-exercise date $T_1 \in [0, T]$. Assume the underlying stock price obeys
equation (1.381) within the risk-neutral measure, and let $P(S_t, K, T - t)$ denote the option value
at calendar time $t$ with spot $S_t$. Find an analytically closed-form expression for the present-time
$t = 0$ price $P(S_0, K, T)$. Hint: This problem is very closely related to the valuation of
a compound option discussed at the end of Section 1.12. In particular, proceed as follows.
From backward recurrence show that

$$P(S_0, K, T) = e^{-rT_1} E_0\!\left[P(S_{T_1}, K, T - T_1)\right], \tag{1.519}$$

with

$$P(S_{T_1}, K, T - T_1) = \begin{cases} P_E(S_{T_1}, K, T - T_1), & S_{T_1} > S^*_{T_1}, \\ K - S_{T_1}, & S_{T_1} \le S^*_{T_1}, \end{cases} \tag{1.520}$$

where $P_E$ is the European put price function, $E_0[\,\cdot\,]$ is the risk-neutral expectation at time 0,
and the critical value $S^*_{T_1}$ for the early-exercise boundary at calendar time $T_1$ solves

$$P_E(S^*_{T_1}, K, T - T_1) = K - S^*_{T_1}.$$

Compute this expectation as a sum of two integrals, one over the domain $S_{T_1} > S^*_{T_1}$ and the
other over $0 < S_{T_1} \le S^*_{T_1}$, while using equations (1.382) and (1.385) to finally arrive at the
expression for $P(S_0, K, T)$ in terms of univariate and bivariate cumulative normal functions.
Show whether $S^*_{T_1}$ is a strictly increasing or decreasing function of the volatility $\sigma$, and
explain your answer. What is this functional dependency for the case of a Bermudan call?
Explain.


CHAPTER 2


Fixed-Income Instruments 


2.1 Bonds, Futures, Forwards, and Swaps 


2.1.1 Bonds


A bond is paper issued by a corporate or sovereign entity promising a cash flow stream at 
future dates. In this chapter, we make the important assumption that credit risk is negligible, 
meaning that the probability that bond issuers default on their promise of making payments 
is zero. 

Mathematically, a bond is modeled as a cash flow stream with a present value. The
cash flow map of a bond is given by a sequence of pairs $(c, T) = (c_i, T_i)$, $i = 1, \ldots, n$,
where $T_1 < \cdots < T_n$ are future cash flow dates in increasing order and $c_1, \ldots, c_n$ are the
corresponding cash flow amounts. A cash flow stream $(c, T)$ has a present value at calendar
time $t$ denoted by $PV_t(c, T)$. Pure discount bonds, or zero-coupon bonds, are securities with
one single cash flow of fixed amount, i.e., the nominal amount $N$ at maturity $T$; see Figure 2.1.
The continuously compounded yield $y_t(T)$ for the period $[t, T]$ is often used to express the
value $Z_t(T)$ at time $t$ of a zero-coupon bond maturing at time $T$ and is defined as follows:

$$Z_t(T) = \exp\left(-y_t(T)\,(T - t)\right). \tag{2.1}$$

Note that $Z_T(T) = 1$. Simple-compounding rules are often used. The simply compounded
yield $y^{(\alpha)}_t(T)$ with period $\alpha \le T - t$ is defined as follows:

$$Z_t(T) = \left(1 + \alpha\, y^{(\alpha)}_t(T)\right)^{-(T-t)/\alpha}. \tag{2.2}$$

For example, letting $\alpha = (T - t)/n$, $n \ge 1$, gives $n$ simple compounding periods in $[t, T]$.
Notice that in an economy where one postulates that the cost of holding a cash position is
negligible (which is the case if one neglects security costs) one obtains the inequality

$$Z_t(T_1) \ge Z_t(T_2), \tag{2.3}$$

for all maturities $T_1 < T_2$ and any fixed present time $t < T_1$.
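
As a small illustration of equations (2.1) and (2.2), the following Python sketch converts between the two compounding conventions. The function names are hypothetical, and the conversion rule simply equates the two expressions for the same discount factor.

```python
import math

def zero_price_continuous(y, t, T):
    """Equation (2.1): discount factor from a continuously compounded yield y."""
    return math.exp(-y * (T - t))

def zero_price_simple(y_alpha, alpha, t, T):
    """Equation (2.2): discount factor from a simply compounded yield of period alpha."""
    return (1.0 + alpha * y_alpha) ** (-(T - t) / alpha)

def simple_from_continuous(y, alpha):
    """Simply compounded yield that reproduces the discount factor of (2.1)."""
    return (math.exp(alpha * y) - 1.0) / alpha
```

With a 5% continuous yield and semiannual compounding, `zero_price_simple(simple_from_continuous(0.05, 0.5), 0.5, 0.0, 2.0)` agrees with `zero_price_continuous(0.05, 0.0, 2.0)` up to rounding.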
FIGURE 2.1 Zero-coupon bond with one cash flow at maturity T.

FIGURE 2.2 Cash flow stream for an n-coupon bond.


A cash flow stream (c, T) of multiple n-coupon payments can be replicated by means 
of a portfolio of zero-coupon bonds. Figure 2.2 depicts such a cash flow stream with equal 
payments until maturity, at which time a nominal payment in the amount of N is made. 
Assuming that zero-coupon bonds of all maturities are traded, the present value of the given 
cash flow stream is given by the sum of discounted cash flows: 


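$$PV_t(c, T) = \sum_{i=1}^{n} c_i\, e^{-y_t(T_i)(T_i - t)} = \sum_{i=1}^{n} c_i \left(1 + \alpha\, y^{(\alpha)}_t(T_i)\right)^{-(T_i - t)/\alpha}, \tag{2.4}$$

where the first sum in the equation assumes continuous compounding and the second assumes
simple compounding. One defines yields of a coupon bond with cash flow map $(c, T)$ to be
the quantities $y_t(c, T)$ [or $y^{(\alpha)}_t(c, T)$ for simple compounding] such that

$$PV_t(c, T) = \sum_{i=1}^{n} c_i\, e^{-y_t(c,T)(T_i - t)} = \sum_{i=1}^{n} c_i \left(1 + \alpha\, y^{(\alpha)}_t(c, T)\right)^{-(T_i - t)/\alpha}, \tag{2.5}$$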


where, again, the first sum in the equation assumes continuous compounding and the second 
assumes simple compounding. 
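
The distinction between equations (2.4) and (2.5) can be made concrete with a short Python sketch under continuous compounding. The function names and the bisection bracket are illustrative assumptions; the bond yield is found numerically as the single rate that reproduces the present value obtained from the zero-coupon curve.

```python
import math

def present_value(cash_flows, times, zero_yield, t=0.0):
    """First sum in (2.4): each cash flow is discounted at the zero yield of its own date."""
    return sum(c * math.exp(-zero_yield(T) * (T - t)) for c, T in zip(cash_flows, times))

def bond_yield(price, cash_flows, times, t=0.0, lo=-0.5, hi=1.0):
    """Yield y_t(c, T) of (2.5): one rate that reprices all cash flows."""
    def pv(y):
        return sum(c * math.exp(-y * (T - t)) for c, T in zip(cash_flows, times))
    for _ in range(100):  # bisection; pv is decreasing in y for positive cash flows
        mid = 0.5 * (lo + hi)
        if pv(mid) > price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For a three-year bond with cash flows `[5.0, 5.0, 105.0]` at times `[1.0, 2.0, 3.0]` and an upward-sloping curve such as `lambda T: 0.03 + 0.005 * T`, the bond yield lies between the shortest and longest zero yields, as expected from a weighted average of discount rates.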

FIGURE 2.3 A comparison of equivalent present-value cash flows for an FRA with payments in
arrears and in advance. The three figures correspond to the three possibilities of designing the cash
flows: either both occurring at $T_1$, or both at $T_2$, or one at $T_1$ and one at $T_2$.


Besides coupon bonds, some instruments with uncertain cash flows can also be priced in
terms of the zero-coupon bonds. An example is a bond-forward contract. This is a forward
contract on a zero-coupon bond of given maturity $T_2$, with a future settlement date $T_1$. Two
parties A and B agree, at present time $t$, that a prescribed interest rate will apply within some
interval $[T_1, T_2]$ in the future, with $t \le T_1 < T_2$. A bond-forward of nominal $N$ is equivalent
to the combination of two cash flows, as depicted in Figure 2.3. Party A pays an amount $N$
at time $T_1$, and after a time $\tau$ she receives an amount

$$N\left(1 + \alpha f^{(\alpha)}_t(T_1, T_2)\right)^{\tau/\alpha} = N \exp\left(\tau f_t(T_1, T_2)\right) \tag{2.6}$$

at time $T_2$. Here, $\tau = (T_2 - T_1)$ is the tenor and $f^{(\alpha)}_t(T_1, T_2)$ is the forward rate computed with
a simple-compounding rule of period $\alpha \le \tau$, while $f_t(T_1, T_2)$ uses continuous compounding
as further explained below. Notice that in the limit when the forward maturity is at current
time, i.e., when $T_1 = t$, forward rates coincide with yields, i.e.,

$$f^{(\alpha)}_t(t, T_2) = y^{(\alpha)}_t(T_2), \tag{2.7}$$


and $y_t(T_2) = f_t(t, T_2)$ if continuous compounding is assumed instead. The most convenient
compounding convention for forward rates is the one with an intermediate compounding
period equal to the tenor, i.e., $\alpha = \tau$. The equilibrium value of the forward rate is the
rate for which the present value of the bond-forward contract is zero. Assuming continuous
compounding, the present value of the two cash flows is

$$PV_t = -N Z_t(T_1) + N e^{\tau f_t(T_1, T_2)} Z_t(T_2), \tag{2.8}$$

whereas for simple compounding

$$PV_t = N\left(Z_t(T_2) - Z_t(T_1)\right) + N\tau f^{(\tau)}_t(T_1, T_2)\, Z_t(T_2). \tag{2.9}$$

The equilibrium rate corresponds to the value for which $PV_t = 0$, hence giving

$$f_t(T_1, T_2) = \frac{1}{\tau}\log\left(\frac{Z_t(T_1)}{Z_t(T_2)}\right). \tag{2.10}$$

This coincides with the continuously compounded forward rate for the interval $[T_1, T_2]$ as
viewed at present time $t$. In contrast, for simple compounding the equilibrium rate (or forward
rate) denoted by $f^{(\tau)}_t(T_1, T_2)$ satisfies

$$1 + \tau f^{(\tau)}_t(T_1, T_2) = \frac{Z_t(T_1)}{Z_t(T_2)}. \tag{2.11}$$

Note that the forward rate is also related to the forward price for a unit zero-coupon bond
maturing at time $T_2$ with settlement at time $T_1$. Forward rates and forward prices are further
discussed in later sections.
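
Equations (2.8), (2.10), and (2.11) translate directly into code. The following Python sketch, with hypothetical function names, computes both equilibrium forward rates from two discount factors and evaluates the bond-forward present value.

```python
import math

def forward_continuous(Z1, Z2, tau):
    """Equation (2.10): continuously compounded forward rate from Z_t(T_1), Z_t(T_2)."""
    return math.log(Z1 / Z2) / tau

def forward_simple(Z1, Z2, tau):
    """Equation (2.11) solved for the simply compounded forward rate of period tau."""
    return (Z1 / Z2 - 1.0) / tau

def bond_forward_pv(N, Z1, Z2, tau, f):
    """Equation (2.8): present value of the bond-forward struck at the continuous rate f."""
    return -N * Z1 + N * math.exp(tau * f) * Z2
```

Striking the contract at `forward_continuous(Z1, Z2, tau)` makes `bond_forward_pv` vanish, and the two conventions are linked by $1 + \tau f^{(\tau)}_t = e^{\tau f_t}$.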
Forward Rate Agreements


A forward rate agreement (FRA) is an instrument with the same risk profile, cash flow map,
and present value as a bond-forward, but with only one actual cash flow. Such FRAs are
struck at the equilibrium forward rate at the time of issue and come in two flavors, since
payments can be either in advance or in arrears. In an FRA with payments in arrears, struck
at the equilibrium rate $f_t(T_1, T_2)$, there is only one cash flow (with positive and negative
components) at time $T_2$. Using equation (2.8), or (2.9), and inflating the cash flow at time $T_1$
into a cash flow at time $T_2$ gives only one cash flow at time $T_2$, of amount

$$N\tau\left[f^{(\tau)}_t(T_1, T_2) - y^{(\tau)}_{T_1}(T_2)\right] \tag{2.12}$$

for simple compounding or

$$N\left[e^{\tau f_t(T_1, T_2)} - e^{\tau y_{T_1}(T_2)}\right] \tag{2.13}$$

for continuous compounding. In contrast, in a similar FRA with payments in advance, the
cash flow occurs only at time $T_1$. Discounting the cash flow at time $T_2$ back to time $T_1$ gives
the following payoff amount for an FRA with payments in advance:

$$N\left(e^{\tau\left(f_t(T_1, T_2) - y_{T_1}(T_2)\right)} - 1\right) \tag{2.14}$$

for continuous compounding or

$$N\left(\frac{1 + \tau f^{(\tau)}_t(T_1, T_2)}{1 + \tau y^{(\tau)}_{T_1}(T_2)} - 1\right) \tag{2.15}$$

for simple compounding. The cash flows for these FRAs are depicted in Figure 2.3.
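
A short Python sketch, with hypothetical function names, makes the relation between the two simply compounded payoffs explicit: discounting the in-arrears amount (2.12) from $T_2$ back to $T_1$ at the realized yield gives the in-advance amount (2.15).

```python
def fra_payoff_arrears(N, tau, f, y):
    """Equation (2.12): single cash flow at T_2; f is the strike, y the yield realized at T_1."""
    return N * tau * (f - y)

def fra_payoff_advance(N, tau, f, y):
    """Equation (2.15): single cash flow at T_1."""
    return N * ((1.0 + tau * f) / (1.0 + tau * y) - 1.0)
```

For any inputs, `fra_payoff_arrears(N, tau, f, y) / (1 + tau * y)` equals `fra_payoff_advance(N, tau, f, y)` up to rounding.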


Problems 
Problem 1. Prove that the condition (2.3) implies that all forward rates are nonnegative. 


Problem 2. Conversely, prove that if all forward rates are positive, then the discount function 
is monotonically decreasing, i.e., that condition (2.3) holds. 


Floating Rate Notes


A floating rate note (FRN) is an instrument with a series of settlement dates $T_j = T_0 + j\tau$,
$j = 0, \ldots, n$, at which cash flows occur. In contrast to a bond, the size of a cash flow $c(T_j)$
(i.e., the coupon payment) at the generic date $T_j$ depends on the interest rate prevailing at
time $T_j$ or earlier. In the simplest, so-called plain-vanilla structures, cash flow amounts are
defined in a manner that the FRN can be associated to a cash flow map and priced directly
off the yield curve, i.e., with no volatility risk. There are two variations of FRNs. Either the
coupon payments are settled in arrears, i.e., paid out at time $T_j$ based on the rate for the
period that just ended, $(T_j - \tau, T_j]$, or they are settled in advance with payments at time $T_{j-1}$.
A plain-vanilla FRN with payments in arrears has cash flows given by

$$c(T_j) = \tau N y^{(\tau)}_{T_{j-1}}(T_j) + N\delta_{jn}. \tag{2.16}$$

Here $N$ is the notional amount of the FRN and $\delta_{jn}$ equals 1 in case $j = n$ and 0 otherwise;
hence, the second term in equation (2.16) represents the notional repayment, which takes
place only at the time of maturity $T_n$. For an FRN with payments in advance, the cash flows
for times $T_j < T_n$ are obtained by discounting at the rate $y^{(\tau)}_{T_j}(T_{j+1})$; hence,

$$c(T_j) = \frac{\tau N y^{(\tau)}_{T_j}(T_{j+1})}{1 + \tau y^{(\tau)}_{T_j}(T_{j+1})} \tag{2.17}$$

if $j < n$ and $c(T_n) = N$ at maturity. Note that here we are assuming simple compounding
with fixed period $\tau$. The present value at time $t \le T_0$ is the same in either case. In particular,
with payments in arrears we have

$$FRN_t = \sum_{j=0}^{n} c(T_j) Z_t(T_j) = c(T_0) Z_t(T_0) + N\tau \sum_{j=1}^{n} y^{(\tau)}_{T_{j-1}}(T_j)\, Z_t(T_j) + N Z_t(T_n). \tag{2.18}$$

This expression simplifies by using the relation

$$\left(1 + \tau y^{(\tau)}_{T_{j-1}}(T_j)\right) Z_t(T_j) = Z_t(T_{j-1}) \tag{2.19}$$

in the above sum (in which, viewed from time $t$, the yield for each period is the forward rate
of equation (2.11)). The sum then collapses to give

$$FRN_t = N Z_t(T_0) + c(T_0) Z_t(T_0). \tag{2.20}$$

In financial terms, this follows from the fact that if one has the notional amount available
at time $T_0$ and invests it in a series of term deposits of tenor $\tau$ until maturity, one generates
all the cash flows corresponding to the coupon payments after the initial one, as well as the
principal repayment. This is depicted in Figure 2.4.
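
The collapse of equation (2.18) into (2.20) is easy to check numerically. The Python sketch below uses a hypothetical helper and assumes that each floating coupon is valued at the simply compounded forward rate of equation (2.11).

```python
def frn_values(N, tau, discount_factors, first_coupon=0.0):
    """Return the full sum (2.18) and the collapsed form (2.20).

    discount_factors holds [Z_t(T_0), ..., Z_t(T_n)].
    """
    Z = discount_factors
    n = len(Z) - 1
    # forward rates for each period (T_{j-1}, T_j], from equation (2.11)
    fwd = [(Z[j - 1] / Z[j] - 1.0) / tau for j in range(1, n + 1)]
    full = first_coupon * Z[0] + N * tau * sum(f * z for f, z in zip(fwd, Z[1:])) + N * Z[n]
    collapsed = N * Z[0] + first_coupon * Z[0]
    return full, collapsed
```

For any decreasing list of discount factors the two returned values agree up to rounding, which reflects the telescoping sum behind equation (2.20).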


FIGURE 2.4 Equivalent cash flows for an FRN.


Plain-Vanilla Swaps


A payer's interest rate swap can be regarded as a combination of a short position in a floating
rate note (the floating leg) and a long position in a bond (the fixed leg) with the same nominal or
principal amount $N$ and paying coupons at a preassigned fixed rate $r^s$. A receiver's interest
rate swap can be regarded as a short payer's swap. Cash flow dates are at times $T_j = T_0 + j\tau$,
$j = 0, \ldots, n$, with period $\tau$. Clearly, swaps can be priced directly from the yield curve, and
their replication does not involve any volatility risk. Swaps come in two variations, with the
floating rate (typically a six-month LIBOR) agreed to be the rate prevailing either at the
beginning or at the end of each period $(T_{j-1}, T_j]$. Assuming a principal repayment of $N$ at
time $T_n$, the present value at time $t$ of the fixed leg is

$$PV^{fixed}_t = c^{fixed}(T_0) Z_t(T_0) + N r^s \sum_{j=1}^{n} \tau Z_t(T_j) + N Z_t(T_n) \tag{2.21}$$

and that for the floating leg is

$$PV^{float}_t = c^{float}(T_0) Z_t(T_0) + N \sum_{j=1}^{n} \tau y^{(\tau)}_{T_{j-1}}(T_j)\, Z_t(T_j) + N Z_t(T_n), \tag{2.22}$$

with simple compounding at the floating rate assumed. From arbitrage arguments it also
follows that the yields in this equation are given by the forward rates $f^{(\tau)}_t(T_{j-1}, T_j)$.

The swap rate $r^s_t$ is said to be at equilibrium at time $t$ if the present value to the receiver
or payer of the swap at time $t$ is zero, i.e., if $PV^{fixed}_t = PV^{float}_t$. More precisely, using algebra
similar to what was used in the preceding section on FRNs [i.e., using equation (2.19)],
the equilibrium swap rate of a swap with payments in arrears can be shown to satisfy the
following equation:

$$N\left(Z_t(T_0) - Z_t(T_n)\right) + \left(c^{float}(T_0) - c^{fixed}(T_0)\right) Z_t(T_0) = N r^s_t \sum_{j=1}^{n} \tau Z_t(T_j). \tag{2.23}$$

Assuming equal initial coupons $c^{float}(T_0) = c^{fixed}(T_0)$, we have

$$r^s_t = \frac{Z_t(T_0) - Z_t(T_n)}{\sum_{j=1}^{n} \tau Z_t(T_j)}. \tag{2.24}$$

It is important to note that this result is independent of any assumed short rate model. Also,
from the cash flow structure one can observe that interest rate swaps may be decomposed in
terms of FRAs. Figure 2.5 shows the basic cash flow map of a receiver's swap with variable
positive cash flows and the corresponding negative fixed amounts.
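
Equation (2.24) needs only the discount factors at the cash flow dates. A minimal Python sketch, with a hypothetical function name, is:

```python
def equilibrium_swap_rate(discount_factors, tau):
    """Equation (2.24); discount_factors holds [Z_t(T_0), ..., Z_t(T_n)]."""
    Z = discount_factors
    return (Z[0] - Z[-1]) / (tau * sum(Z[1:]))
```

As a sanity check, if the curve is flat in the sense that $Z_t(T_j) = (1 + \tau R)^{-j}$ for a single simply compounded rate $R$, the geometric sum in the denominator gives back exactly $r^s_t = R$.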


FIGURE 2.5 Fixed-leg and floating-leg cash flows for a receiver's swap.


Constructing the Discount Curve


In this section, we describe the most liquid classes of interest-sensitive assets. These
instruments can be priced directly from the discount curve and owe their popularity to the relative
ease of replication, which results in liquid, efficient markets. Conversely, prices of such assets
are used to infer information about the discount curve. The discount curve is found by an
interpolation algorithm, subject to the requirement that the present values $P_i$ of a series of
cash flow maps $c_i$, $i = 1, 2, \ldots$, are reproduced, that is, subject to

$$P_i = \sum_{j} c_{ij}\, Z_0(T_{ij}), \tag{2.25}$$

where $T_{ij}$ is the time when the $j$th cash flow of the $i$th cash flow map occurs and $c_{ij}$ is the
corresponding amount.

A variety of analytical methods can be used to imply the discount curve. The following
is a possible strategy that works quite well for the LIBOR curve. The method consists of two
steps. In the first step one finds a best fit in a special parameterized family of meaningful
discount functions. A possibility is to use the CIR discount function $Z^{CIR}_0(T)$, introduced in
the following sections, but other choices would work as well. As a second step, one can
represent the discount curve as

$$Z_0(T) = Z^{CIR}_0(T) + \delta Z_0(T) \tag{2.26}$$

and find the correction, $\delta Z_0(T) = Z_0(T) - Z^{CIR}_0(T)$, in such a way that the present values of
the cash flow map in equation (2.25) are exactly reproduced, forward rates are positive, and
the function $\delta Z_0(T)$ is as smooth as possible.

Cubic splines can be used to represent the function $\delta Z_0(T)$. A cubic spline is parameterized
by the function values and the second derivatives on a time grid $T_1, \ldots, T_n$. The value of
$\delta Z_0(T)$ for time $T \in (T_\alpha, T_{\alpha+1})$ falling in between the grid points can be interpolated as
follows, using a cubic polynomial:

$$\delta Z_0(T) = a_\alpha (T - T_\alpha)^3 + b_\alpha (T - T_\alpha)^2 + c_\alpha (T - T_\alpha) + d_\alpha. \tag{2.27}$$

The constants $a_\alpha$, $b_\alpha$, $c_\alpha$, $d_\alpha$ solve the equations

$$d_\alpha = \delta Z_0(T_\alpha), \qquad 2b_\alpha = \delta Z_0''(T_\alpha), \tag{2.28}$$

$$a_\alpha (T_{\alpha+1} - T_\alpha)^3 + b_\alpha (T_{\alpha+1} - T_\alpha)^2 + c_\alpha (T_{\alpha+1} - T_\alpha) + d_\alpha = \delta Z_0(T_{\alpha+1}), \tag{2.29}$$

$$6a_\alpha (T_{\alpha+1} - T_\alpha) + 2b_\alpha = \delta Z_0''(T_{\alpha+1}). \tag{2.30}$$

This set of equations, in the given coefficients for each grid point $\alpha$, involves function
evaluations at both times $T_\alpha$ and $T_{\alpha+1}$, some of which correspond to points outside the
discount curve. Hence, the equations constitute an underdetermined linear system. A good
way to select a satisfactory solution is to further require that the weighted sum of squares

$$\sum_{\alpha=1}^{n} \left( \delta Z_0(T_\alpha)^2 + \lambda\, \delta Z_0''(T_\alpha)^2 \right) \tag{2.31}$$

be minimal. The parameter $\lambda$ adjusts the so-called tension of the yield curve. The limit $\lambda \to 0$
corresponds to an infinitely tense curve, in which the discount factors are linearly interpolated
between the vertices. In the limit $\lambda \to \infty$, sharp turns in the curve are highly penalized. The
spreadsheet (related to the "Interest Rate Trees: Calibration and Pricing" project of Part II)
can be worked out by the reader interested in implementing the details of this fitting scheme,
as depicted in Figure 2.6.

FIGURE 2.6 An actual-yield curve versus the yield curve obtained using a CIR discount function. The
actual forward rates curve is also drawn for comparison.
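
For one grid interval, equations (2.28) to (2.30) determine the four cubic coefficients once the values and second derivatives at both end points are given. The Python sketch below solves them in closed form. The function names are hypothetical, and the fitting of the nodal values themselves (the minimization of (2.31) subject to (2.25)) is not attempted here.

```python
def spline_segment(T0, T1, v0, v1, m0, m1):
    """Coefficients (a, b, c, d) of (2.27) on [T0, T1].

    v0, v1 are the values and m0, m1 the second derivatives of the
    correction at T0 and T1.
    """
    h = T1 - T0
    d = v0                                   # first relation in (2.28)
    b = 0.5 * m0                             # second relation in (2.28)
    a = (m1 - m0) / (6.0 * h)                # from (2.30)
    c = (v1 - v0) / h - a * h ** 2 - b * h   # from (2.29)
    return a, b, c, d

def spline_value(coeffs, T0, T):
    """Evaluate the cubic (2.27) at a time T in the segment starting at T0."""
    a, b, c, d = coeffs
    x = T - T0
    return ((a * x + b) * x + c) * x + d
```

Evaluating `spline_value` at the right end point returns `v1`, and the second derivative `6*a*h + 2*b` at that point returns `m1`, so neighboring segments join with continuous value and curvature.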


2.2 Pricing Measures and Black-Scholes Formulas 


In Section 1.12 we derived pricing formulas of the Black-Scholes type assuming interest rates
are deterministic functions of time. In this section, we lift this restriction and find
Black-Scholes-type models that are solvable, giving explicit pricing formulas for stock options
with stochastic interest rates and a number of interest rate derivatives. Pricing models for
interest rate derivatives are based mostly on the postulate that interest rates and the discount
function follow a diffusion process, thus ruling out jumps. In a general diffusion model, the
price process for discount bonds $Z_t(T)$ of the various maturity dates $T$ obeys a stochastic
differential equation of the following form:

$$dZ_t(T) = (r_t + q_t)\, Z_t(T)\, dt + Z_t(T)\, \sigma^{Z(T)}_t\, dW_t. \tag{2.32}$$

Here, $q_t$ is a price of risk component dependent on the chosen numeraire, while $\sigma^{Z(T)}_t$ is the
zero-coupon bond (lognormal) volatility.

Recall that the pricing formula in the asset pricing theorem (covered in Chapter 1) provides
a way to express prices in terms of discounted expectations of future pay-offs with respect to
a pricing measure:

$$A_t = g_t\, E^{Q(g)}_t\!\left[\frac{A_T}{g_T}\right]. \tag{2.33}$$

In this formula, "discounting" is achieved through a numeraire asset $g_t$ whose volatility
is the price of risk for the pricing measure denoted by $Q(g)$. The actual asset price $A_t$
is independent of $g$; changing the numeraire is equivalent to changing coordinates in path
space. Recall that all domestic assets drift at the instantaneous domestic risk-free rate plus a
price of risk component given by the dot product $\sigma_g \cdot \sigma_A$, where $\sigma_g$ and $\sigma_A$ are lognormal
volatility vectors of the chosen numeraire $g_t$ and the asset price $A_t$. As the following example
demonstrates, it is useful to select the appropriate numeraire asset in order to derive pricing
formulas in analytically closed form. The choices of numeraire asset we use in this section are:


• Risk-neutral measure, corresponding to selecting $g_t = B_t = e^{\int_0^t r_s\,ds}$, the money-market
or savings account

• Forward measure with maturity $T$ (also called the $T$-forward measure), corresponding
to selecting $g_t = Z_t(T)$, the zero-coupon bond price with maturity date $T$

• Bond-forward measure with cash flow map $(c, T)$, corresponding to selecting the
bond's present value:

$$g_t = \sum_{i=1}^{n} c_i\, Z_t(T_i). \tag{2.34}$$


To achieve solvability, it is also necessary to identify an appropriate stochastic process 
whose expectation at maturity time one proposes to compute. As the following examples show, 
sometimes the obvious choice of the process is not the most convenient for the calculations. 
Furthermore, one needs to postulate a stochastic differential equation for the selected process 
whereby the drift is simple to compute (possibly zero) and the volatility is a deterministic 
function of time under the chosen measure. In the following sections we argue that there is 
a large class of models — known as Gaussian models — that naturally lead to deterministic 
volatilities in several important cases. 


Stock Options with Stochastic Interest Rates 


Consider a call option on the stock with price $S_t$ at time $t$, strike $K$, and maturity $T$. Let
$F_t(S, T) = S_t / Z_t(T)$ be the forward price for the stock, with delivery at time $T$. Since
$S_T = F_T(S, T)$, the pay-off for the call option can be written as follows:

$$C_T = \left(F_T(S, T) - K\right)_+, \tag{2.35}$$

where $(x)_+ = \max(x, 0)$. In the forward measure $Q(g)$ with numeraire $g_t = Z_t(T)$, the forward price
$F_t(S, T)$ is a martingale. Hence, we suppose that the process for $F_t(S, T)$ is given by

$$\frac{dF_t(S, T)}{F_t(S, T)} = \sigma(t)\, dW_t, \tag{2.36}$$

where the volatility $\sigma(t)$ of the forward price is a deterministic function of time. Recall from
Section 1.6 that the transition probability distribution for such a process is lognormal:

$$p(F_t, t; F_T, T) = \frac{1}{F_T\, \bar\sigma \sqrt{2\pi (T - t)}} \exp\left( -\frac{\left[\log(F_t / F_T) - \bar\sigma^2 (T - t)/2\right]^2}{2 \bar\sigma^2 (T - t)} \right), \tag{2.37}$$

where $F_t = F_t(S, T)$ and $\bar\sigma$ involves the time-averaged square of the lognormal volatility,

$$\bar\sigma^2 = \frac{1}{T - t} \int_t^T \sigma(u)^2\, du. \tag{2.38}$$

Putting equation (2.37) with equation (2.33) and using the fact that $Z_T(T) = 1$, the pricing
formula for the value $C_t$ of the call option at time $t$ is then given by

$$\frac{C_t}{Z_t(T)} = E^{Q(Z(T))}_t\!\left[\left(F_T(S, T) - K\right)_+\right] = \int_0^\infty p(F_t, t; F_T, T)\,(F_T - K)_+\, dF_T = F_t(S, T)\, N(d_+) - K N(d_-), \tag{2.39}$$

where

$$d_\pm = \frac{\log\left(F_t(S, T)/K\right) \pm \frac{1}{2}\bar\sigma^2 (T - t)}{\bar\sigma \sqrt{T - t}} \tag{2.40}$$

and $N(\cdot)$ is the cumulative standard normal distribution function.
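
Equations (2.38) to (2.40) can be evaluated in a few lines. The Python sketch below is illustrative only: the function names are hypothetical, the forward-price volatility is passed as an arbitrary deterministic function of time, and the time average in (2.38) is approximated with a midpoint rule.

```python
import math

def norm_cdf(x):
    """Standard cumulative normal function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def average_variance(sigma, t, T, steps=1000):
    """Equation (2.38) by the midpoint rule; sigma is a function of time."""
    h = (T - t) / steps
    return sum(sigma(t + (k + 0.5) * h) ** 2 for k in range(steps)) * h / (T - t)

def call_forward_measure(S, Z, K, sigma, t, T):
    """Equation (2.39): C_t = Z_t(T) [F N(d_+) - K N(d_-)] with F = S_t / Z_t(T)."""
    F = S / Z
    total_var = average_variance(sigma, t, T) * (T - t)
    sd = math.sqrt(total_var)
    d_plus = (math.log(F / K) + 0.5 * total_var) / sd
    d_minus = d_plus - sd
    return Z * (F * norm_cdf(d_plus) - K * norm_cdf(d_minus))
```

When the volatility is constant and the bond price is $Z_t(T) = e^{-r(T-t)}$, the function reduces to the standard Black-Scholes call value, which is a convenient check.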
Swaptions


Consider a payer swaption (or call swaption) struck at rate $r_K$ and of maturity $T$. The
underlying is the fixed leg with pay-off as present value of all future cash flows if the swap
rate $r^s_T > r_K$:

$$PSO_T = \tau \left(r^s_T - r_K\right)_+ \sum_{j=1}^{n} Z_T(T_j), \tag{2.41}$$

where $\tau = T_{j+1} - T_j$ is the tenor. As a numeraire, select the present value of a stream of unit
cash flows occurring at the coupon dates, $T_1 = T + \tau, \ldots, T_n = T + n\tau$, of the fixed leg:

$$g_t = \sum_{j=1}^{n} Z_t(T_j), \qquad t \le T. \tag{2.42}$$

Recalling the expression in equation (2.24) we see that the swap rate $r^s_t$ is a ratio of two
assets, with denominator corresponding to the numeraire $g_t$. In this case one can easily show
from the formula in equation (1.137) that $r^s_t$ is a martingale (i.e., has zero drift)
with respect to the pricing measure $Q(g_t)$. Assuming that the lognormal volatility of the
swap rate is a deterministic function of time, we set $\sigma^{r^s}_t = \sigma(t)$. The transition probability
distribution function for the swap rate is then a lognormal function $p(r^s_t, t; r^s_T, T)$, similar to
equation (2.37). Using steps similar to those in the previous section, one obtains the following
Black-Scholes pricing formula for the swaption price $PSO_t$ at time $t$:

$$\frac{PSO_t}{g_t} = \tau\, E^{Q(g)}_t\!\left[\left(r^s_T - r_K\right)_+\right] = \tau \left[ r^s_t N(d_+) - r_K N(d_-) \right], \tag{2.43}$$

where

$$d_\pm = \frac{\log\left(r^s_t / r_K\right) \pm \frac{1}{2}\bar\sigma^2 (T - t)}{\bar\sigma \sqrt{T - t}}, \tag{2.44}$$

$N(\cdot)$ is the cumulative standard normal distribution function, and $\bar\sigma$ is defined as in
equation (2.38), with time average taken over the squared lognormal volatility of the swap rate.


Caplets


Consider a caplet struck at fixed interest rate $r_K$, maturing at time $T$, on a floating rate $y^{(\tau)}_T(T+\tau)$ of tenor $\tau$ applied to the period $[T, T+\tau]$ in the future. The floating rate is typically the three- or six-month LIBOR. The pay-off of this caplet is given by a capped-rate differential compounded in time $\tau$ multiplied by the discount function over that period:

$$\mathrm{Cpl}_T = \left(y^{(\tau)}_T(T+\tau) - r_K\right)_+ \tau\,Z_T(T+\tau), \tag{2.45}$$

where the simply compounded yield is given by

$$y^{(\tau)}_T(T+\tau) = \tau^{-1}\left(Z_T(T+\tau)^{-1} - 1\right) = f^{(\tau)}_T(T, T+\tau). \tag{2.46}$$

Hence in terms of forward rates we have

$$\mathrm{Cpl}_T = \left(f^{(\tau)}_T(T, T+\tau) - r_K\right)_+ \tau\,Z_T(T+\tau). \tag{2.47}$$

In the measure $Q(g)$ with numeraire asset

$$g_t = Z_t(T+\tau), \tag{2.48}$$

the simply compounded forward rate

$$f^{(\tau)}_t(T, T+\tau) = \frac{1}{\tau}\left(\frac{Z_t(T)}{Z_t(T+\tau)} - 1\right) \tag{2.49}$$

is readily seen to be a martingale. Note that this follows because the forward rate is (besides the constant term $\tau^{-1}$) a ratio of two assets $Z_t(T)$ and $Z_t(T+\tau)$, where the denominator is $g_t$. As in the previous examples, the transition probability distribution $p(f_t, t; f_T, T)$ for the forward rate $f_t = f^{(\tau)}_t(T, T+\tau)$ can be assumed lognormal and of the form in equation (2.37), with lognormal volatility $\sigma^f = \sigma(t)$ of the forward rate taken as a deterministic function of time. Hence, the pricing formula at time $t < T$ for the caplet with value $\mathrm{Cpl}_t$ is

$$\begin{aligned}
\mathrm{Cpl}_t &= \tau\,Z_t(T+\tau)\,E_t^{Q(g)}\left[(f_T - r_K)_+\right] \\
&= Z_t(T+\tau)\left[\tau f^{(\tau)}_t(T, T+\tau)N(d_+) - \tau r_K N(d_-)\right] \\
&= \left[Z_t(T) - Z_t(T+\tau)\right]N(d_+) - \tau r_K Z_t(T+\tau)N(d_-),
\end{aligned} \tag{2.50}$$

where

$$d_\pm = \frac{\log\left(f^{(\tau)}_t(T, T+\tau)/r_K\right) \pm \frac{1}{2}\bar\sigma^2(T-t)}{\bar\sigma\sqrt{T-t}}, \tag{2.51}$$

$N(\cdot)$ is the cumulative standard normal distribution function, and $\bar\sigma$ is defined as in equation (2.38), with time average taken over the squared lognormal volatility of the forward rate.
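
The two closed forms on the last two lines of equation (2.50) can be checked against each other numerically. The sketch below does so in Python; the function name and the market inputs are hypothetical, chosen only for illustration.

```python
import math


def norm_cdf(x):
    """Cumulative standard normal distribution N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def caplet_price(z_reset, z_pay, tau, strike, sigma_bar, time_to_reset):
    """Return the caplet value in the two equivalent forms of equation (2.50).

    z_reset = Z_t(T) and z_pay = Z_t(T + tau); sigma_bar is the averaged
    lognormal volatility of the forward rate over [t, T].
    """
    forward = (z_reset / z_pay - 1.0) / tau               # equation (2.49)
    total_vol = sigma_bar * math.sqrt(time_to_reset)
    d_plus = (math.log(forward / strike) + 0.5 * total_vol ** 2) / total_vol
    d_minus = d_plus - total_vol
    in_forward_rate = z_pay * tau * (forward * norm_cdf(d_plus)
                                     - strike * norm_cdf(d_minus))
    in_discount_factors = ((z_reset - z_pay) * norm_cdf(d_plus)
                           - tau * strike * z_pay * norm_cdf(d_minus))
    return in_forward_rate, in_discount_factors


# Hypothetical market data: reset in one year, six-month tenor.
value_a, value_b = caplet_price(z_reset=0.96, z_pay=0.94, tau=0.5,
                                strike=0.04, sigma_bar=0.20, time_to_reset=1.0)
```

The second form is convenient in practice because it needs only the two discount factors and never forms the forward rate explicitly outside the logarithm.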


2.2.4


Options on Bonds


Consider a European call option struck at exercise $K$, of maturity date $T$, written on a coupon-bearing bond. The option pay-off can be written

$$\mathrm{BO}_T = (P_T - K)_+, \tag{2.52}$$

where $P_t$ is the present value of the bond,

$$P_t = \sum_{j=1}^{n} c_j Z_t(T_j), \tag{2.53}$$

with cash flows $c_1, \ldots, c_n$ at times $T_n > T_{n-1} > \cdots > T_1 > T$. Note that the sum in this present value involves only cash flows at future times past the maturity of the option. As numeraire asset, we choose $g_t = Z_t(T)$, and we assume a lognormal volatility for the forward price of the bond: $F_t = F_t(P,T) = P_t/Z_t(T)$. Note that with this choice of numeraire the forward price is a zero-drift lognormal process, where we assume the lognormal volatility as a deterministic function of time, $\sigma^F = \sigma(t)$. Noting also that $P_T = F_T(P,T) = F_T$, the resulting pricing formula for the call option on the bond is obtained using steps similar to those in the previous examples:

$$\mathrm{BO}_t = Z_t(T)\,E_t^{Q(g)}\left[(F_T - K)_+\right] = Z_t(T)\left[F_t(P,T)N(d_+) - KN(d_-)\right], \tag{2.54}$$

where

$$d_\pm = \frac{\log(F_t(P,T)/K) \pm \frac{1}{2}\bar\sigma^2(T-t)}{\bar\sigma\sqrt{T-t}}, \tag{2.55}$$

$N(\cdot)$ is the cumulative standard normal distribution function, and $\bar\sigma$ is defined as in equation (2.38), with time average taken over the squared lognormal volatility of the bond forward price. It is important to note that this model is inaccurate when the lifetime of the bond is comparable to the time to maturity of the option, in which case there can be a significant deviation from lognormality due to the pull-to-par effect.


Futures-Forward Price Spread


The spread between the futures price $F_t^*(A,T)$ and the forward price $F_t(A,T)$ of an underlying asset $A$, whose spot price at time $t$ is $A_t$, is given by equation (1.330). This difference was demonstrated in Section 1.11 to be zero in the case when interest rates are deterministic functions of time or when the asset price process is statistically independent of the short rate process. Here the numeraire $g_t = B_t$ is the money-market account. Let us now compute the spread assuming that interest rates are generally stochastic. It suffices to compute the expectation

$$E_t^{Q(B)}[A_T] = E_t^{Q(B)}[F_T(A,T)]. \tag{2.56}$$

We consider the stochastic differential of the forward price process $F_t(A,T)$,

$$\frac{dF_t(A,T)}{F_t(A,T)} = \mu_t^{F(A,T)}\,dt + \sigma_t^{F(A,T)}\,dW_t, \tag{2.57}$$

given that the asset price satisfies

$$\frac{dA_t}{A_t} = \mu_t^{A}\,dt + \sigma_t^{A}\,dW_t. \tag{2.58}$$

Using the results in equations (1.324) and (1.325) for the stochastic differential of the quotient $F_t(A,T) = A_t Z_t(T)^{-1}$, we have

$$\sigma_t^{F(A,T)} = \sigma_t^{A} - \sigma_t^{Z(T)} \tag{2.59}$$

and

$$\mu_t^{F(A,T)} = \sigma_t^{Z(T)}\left(\sigma_t^{Z(T)} - \sigma_t^{A}\right). \tag{2.60}$$

Here we have used the fact that, under the risk-neutral measure $Q(B)$, the drift of the asset price $A_t$ and the drift of the bond price (which is also an asset) are equal, and both are given by the short rate. As was seen in Chapter 1, this follows as a consequence of the important no-arbitrage property that all assets drift at the instantaneous short rate $r_t$ under the risk-neutral measure with the money-market account as numeraire. We should emphasize here that the formulas throughout this section obviously extend to the case of many base risk factors as well. In such cases the drifts and volatilities are vector quantities with components in the base risk factors.

We now make the simplifying assumption that the volatilities of the asset $A_t$ and the bond $Z_t(T)$ are deterministic functions of time, i.e.,

$$\sigma_t^{A} = \sigma^{A}(t), \qquad \sigma_t^{Z(T)} = \sigma^{Z(T)}(t). \tag{2.61}$$

The forward price volatility $\sigma_t^{F(A,T)} = \sigma^{F}(t)$ and drift $\mu_t^{F(A,T)} = \mu^{F}(t)$, for fixed $T$ and given asset $A$, are then also deterministic functions of time as given by equations (2.59) and (2.60). This then allows us to obtain a more explicit formula for the futures-forward price spread, as follows.

Under the measure $Q(B)$, the probability density for the forward price attaining a value $F_T(A,T) = F_T$ at time $T$, given $F_t(A,T) = F_t$ at time $t$, has the lognormal form

$$p(F_t, t; F_T, T) = \frac{1}{\bar\sigma F_T\sqrt{2\pi(T-t)}}\exp\left(-\frac{\left[\log(F_t/F_T) + (\bar\mu - \bar\sigma^2/2)(T-t)\right]^2}{2\bar\sigma^2(T-t)}\right), \tag{2.62}$$

with time-averaged time-dependent drift and volatility

$$\bar\mu = \frac{1}{T-t}\int_t^T \mu^{F}(\tau)\,d\tau, \qquad \bar\sigma^2 = \frac{1}{T-t}\int_t^T \sigma^{F}(\tau)^2\,d\tau. \tag{2.63}$$

An expression for the futures price, in terms of the forward price, is now readily obtained from the integral

$$F_t^*(A,T) = E_t^{Q(B)}[F_T(A,T)] = \int_0^\infty F_T\,p(F_t, t; F_T, T)\,dF_T = F_t(A,T)\,e^{\bar\mu(T-t)}. \tag{2.64}$$

From equations (2.60), (2.63), and (2.64), the futures-forward price spread is therefore

$$F_t^*(A,T) - F_t(A,T) = F_t(A,T)\left[\exp\left(\int_t^T \left(\sigma^{Z(T)}(\tau) - \sigma^{A}(\tau)\right)\sigma^{Z(T)}(\tau)\,d\tau\right) - 1\right]. \tag{2.65}$$

Finally, note that for given $T$ and asset $A$, equation (2.64) shows that $F_t^*(A,T)/F_t(A,T)$ is a deterministic function of time, and hence the volatilities of the futures and forward prices are the same,

$$\sigma^{F^*}(t) = \sigma^{F}(t) = \sigma^{A}(t) - \sigma^{Z(T)}(t). \tag{2.66}$$
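
To make equation (2.65) concrete, the following Python sketch evaluates the ratio $F_t^*(A,T)/F_t(A,T)$ for user-supplied deterministic volatility functions. The function name, the midpoint quadrature, and the constant sample volatilities are assumptions made purely for illustration.

```python
import math


def futures_forward_ratio(sigma_asset, sigma_bond, t, T, steps=2000):
    """Ratio F*_t(A,T) / F_t(A,T) implied by equations (2.64)-(2.65).

    sigma_asset and sigma_bond are callables u -> volatility at time u for
    the asset A and the bond Z(T). The integral is approximated with the
    midpoint rule (an illustrative choice).
    """
    h = (T - t) / steps
    integral = 0.0
    for i in range(steps):
        u = t + (i + 0.5) * h
        integral += (sigma_bond(u) - sigma_asset(u)) * sigma_bond(u) * h
    return math.exp(integral)


# Hypothetical constant volatilities over a two-year horizon.
ratio = futures_forward_ratio(lambda u: 0.20, lambda u: 0.01, t=0.0, T=2.0)
spread_per_unit_forward = ratio - 1.0

# With a zero bond volatility (deterministic rates) the spread vanishes.
no_spread = futures_forward_ratio(lambda u: 0.20, lambda u: 0.0, t=0.0, T=2.0)
```

The second call reproduces the zero-spread case recalled at the start of this subsection for deterministic interest rates.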


2.2.6


Bond Futures Options


Consider a European call option on a futures contract on a zero-coupon bond $Z_t(U)$, with option strike price $K$ and maturity date $T$, with $T < U$. Here the underlying asset $A$ is the zero-coupon bond whose maturity date is $U$ [i.e., $A_t = Z_t(U)$ for given bond maturity date $U$, and at the option expiry date $A_T = Z_T(U)$]. The futures price at any time $t < T$ is denoted by $F_t^*(Z_t(U), T)$; hence the pay-off at the option's expiry time $T$ can be written as follows:

$$\mathrm{BO}_T = \left(F_T^*(Z_T(U), T) - K\right)_+ = \left(F_T(Z_T(U), T) - K\right)_+. \tag{2.67}$$

Here we used the property $F_T^*(A,T) = F_T(A,T)$ for any asset $A$. In order to price this option, we will choose as numeraire the zero-coupon bond with maturity $T$, i.e., $g_t = Z_t(T)$. In this measure the forward price $F_t = F_t(Z_t(U), T) = Z_t(U)/Z_t(T)$ is a martingale. We now make the same assumptions as in the previous section and postulate that the lognormal volatility of a zero-coupon bond of given maturity (i.e., for any $T$ and $U$ values) is a deterministic function of time $t$, with values $\sigma^{Z(T)}(t)$ and $\sigma^{Z(U)}(t)$, for maturities $T$ and $U$, respectively. Here, however, we are working in a probability space with $F_t$ having zero drift. Using equation (2.64), we have $F_t^* = F_t\,e^{\bar\mu(T-t)}$ for the futures price $F_t^*(Z_t(U), T)$, with

$$\bar\mu = \frac{1}{T-t}\int_t^T \left(\sigma^{Z(T)}(\tau) - \sigma^{Z(U)}(\tau)\right)\sigma^{Z(T)}(\tau)\,d\tau. \tag{2.68}$$

The probability density $p(F_t, t; F_T, T)$ for the forward price attaining a value $F_T$ at time $T$, given a value $F_t$ at time $t$, is given by the lognormal form as in equation (2.37) with zero drift coefficient.

The pricing formula for the call on the bond futures contract then follows from similar steps as in the previous subsections:

$$\begin{aligned}
\mathrm{BO}_t &= Z_t(T)\,E_t^{Q(Z(T))}\left[\left(F_T^*(Z_T(U), T) - K\right)_+\right] \\
&= Z_t(T)\,E_t^{Q(Z(T))}\left[\left(F_T(Z_T(U), T) - K\right)_+\right] \\
&= Z_t(T)\left[F_t(Z_t(U), T)N(d_+) - KN(d_-)\right] \\
&= Z_t(T)\left[e^{-\bar\mu(T-t)}F_t^*(Z_t(U), T)N(d_+) - KN(d_-)\right],
\end{aligned} \tag{2.69}$$

where

$$d_\pm = \frac{\log\left(F_t(Z_t(U), T)/K\right) \pm \frac{1}{2}\bar\sigma^2(T-t)}{\bar\sigma\sqrt{T-t}} = \frac{\log\left(F_t^*(Z_t(U), T)/K\right) + \left(-\bar\mu \pm \frac{1}{2}\bar\sigma^2\right)(T-t)}{\bar\sigma\sqrt{T-t}} \tag{2.70}$$

and $N(\cdot)$ is the cumulative standard normal distribution function. Here $\bar\mu$ is given by equation (2.68), whereas

$$\bar\sigma^2 = \frac{1}{T-t}\int_t^T \left(\sigma^{Z(U)}(\tau) - \sigma^{Z(T)}(\tau)\right)^2 d\tau. \tag{2.71}$$

The option price can therefore be expressed either in terms of the futures or forward price, as well as the zero-coupon bond volatility for the two maturities $T$ and $U$.


Problems


Problem 1. Demonstrate a put-call parity relation for European options on a futures contract 
on an underlying zero-coupon bond as described in Section 2.2.6. 


Problem 2. Derive an option-pricing formula similar to that in Section 2.2.6 for a forward 
contract on a bond. 


Problem 3. Derive a Black-Scholes formula for a European bond put option struck at 
exercise K, of maturity T. Is put-call parity satisfied with respect to the call price given in 
Section 2.2.4? 


Problem 4. A floorlet is similar to a caplet, except the floating rate is bounded from below with payoff $\left(r_K - y^{(\tau)}_T(T+\tau)\right)_+ \tau\,Z_T(T+\tau)$. Derive a Black-Scholes formula for a floorlet. Is there a relationship between a floorlet and a caplet?


Problem 5. Caps and floors are collections of caplets and floorlets, respectively, applied to periods $[T_j, T_j + \tau]$, $j = 1, \ldots, n$. Show that a model-independent relationship cap = floor + swap exists.


Problem 6. Provide a Black-Scholes type of formula for a receiver swaption with payoff $\tau\,(r_K - r^s_T)_+ \sum_{j=1}^{n} Z_T(T_j)$.


Problem 7. Provide a Black-Scholes type of formula for a European call option with maturity $T$ and strike $K$ and written on a (unit-nominal) zero-coupon bond with maturity $S > T$. Denote its pricing function by $\mathrm{ZBC}_t(T, S, K)$,

$$\mathrm{ZBC}_t(T, S, K) = E_t^{Q(B)}\left[e^{-\int_t^T r_u\,du}\left(Z_T(S) - K\right)_+\right].$$

Assume the forward price of the bond $F_t(Z(S), T) = Z_t(S)/Z_t(T)$ follows a zero-drift lognormal process with time-dependent volatility $\sigma(t)$ under the $T$-forward measure $Q(Z(T))$ with $Z_t(T)$ as numeraire asset price.


2.3 One-Factor Models for the Short Rate 


2.3.1 


Bond-Pricing Equation 


A possible way of specifying an interest rate process is to assign a stochastic differential equation for the short rate

$$dr_t = \mu^g(r_t, t)\,dt + \sigma(r_t, t)\,dW_t. \tag{2.72}$$

Here $g$ is the numeraire asset. The functions $\mu^g$ and $\sigma$ give the drift and volatility, respectively, of the short rate $r_t$ under the measure with $g$ as numeraire asset. Here we note that the drift and volatility functions in general have an explicit dependence on the variables $r$ and $t$.


Theorem. (Bond-Pricing Equation) If the short rate process described by equation (2.72) is Markovian, then the zero-coupon bond price process $Z_t(T)$ is given by a pricing function $Z(r, t, T)$ so that

$$Z_t(T) = Z(r_t, t, T). \tag{2.73}$$

• The function $Z(r, t, T)$ solves the following partial differential equation:

$$\frac{\partial Z}{\partial t} + \mu^B(r, t)\frac{\partial Z}{\partial r} + \frac{\sigma(r, t)^2}{2}\frac{\partial^2 Z}{\partial r^2} = rZ, \tag{2.74}$$

where $\mu^B(r, t)$ denotes the drift of the short rate under the risk-neutral measure $Q(B)$.

• The drift of the short rate under the measure with numeraire $g$ is given by

$$\mu^g(r, t) = \mu^B(r, t) + \sigma^g(r, t)\,\sigma(r, t), \tag{2.75}$$

where $\sigma^g = \sigma^g(r, t)$, as a function of $r$ and $t$, denotes the volatility function for the numeraire asset price $g_t$ at calendar time $t$.

• Under the risk-neutral measure with choice of numeraire asset as the savings (i.e., money-market) account process, $g_t = B_t = e^{\int_0^t r_s\,ds}$, the discount function at present time $t$, maturing at time $T$, is given by the conditional expectation under the risk-neutral measure

$$Z_t(T) = E_t^{Q(B)}\left[e^{-\int_t^T r_s\,ds}\right] = E^{Q(B)}\left[e^{-\int_t^T r_s\,ds}\,\Big|\,r_t = r\right], \tag{2.76}$$

i.e., with condition $r_t = r$.

• The probability density $P(r, t)$ for the short rate having value $r$ at time $t$, given an initial condition for the density $P(r, 0)$ at time $t = 0$, satisfies the equation

$$\frac{\partial P(r, t)}{\partial t} = \frac{1}{2}\frac{\partial^2}{\partial r^2}\left(\sigma(r, t)^2 P(r, t)\right) - \frac{\partial}{\partial r}\left(\mu^g(r, t)P(r, t)\right). \tag{2.77}$$

Proof. The representation in equation (2.73) is due to the Markov assumption for the short rate: In this situation, the price of a zero-coupon bond can only depend on the short rate value $r$ at calendar time $t$ and on calendar time $t$, given a maturity $T$. By using Itô's lemma, where $Z$ is considered explicitly as a function of the variables $r = r_t$ and $t$, one obtains the stochastic differential for $Z = Z(r_t, t, T)$:

$$dZ = \left(\frac{\partial Z}{\partial t} + \mu^g(r, t)\frac{\partial Z}{\partial r} + \frac{\sigma(r, t)^2}{2}\frac{\partial^2 Z}{\partial r^2}\right)dt + \frac{\partial Z}{\partial r}\,\sigma(r, t)\,dW_t = \mu^{Z(T)}Z\,dt + \sigma^{Z(T)}Z\,dW_t, \tag{2.78}$$

with bond volatility $\sigma^{Z(T)} = \sigma^{Z(T)}(r, t)$ as

$$Z\,\sigma^{Z(T)}(r, t) = \sigma(r, t)\frac{\partial Z(r, t, T)}{\partial r}. \tag{2.79}$$

Here the bond volatility function is denoted explicitly as a function of $r$ and $t$, for given maturity $T$. The Black-Scholes equation for the stochastic differential equation (2.78) gives the pricing function for bonds as satisfying

$$\frac{\partial Z}{\partial t} + \mu^g(r, t)\frac{\partial Z}{\partial r} + \frac{\sigma(r, t)^2}{2}\frac{\partial^2 Z}{\partial r^2} = rZ + q^g\,\sigma(r, t)\frac{\partial Z}{\partial r}, \tag{2.80}$$

with price of risk $q^g = \sigma^g$. Note that this also follows by taking expectations on both sides of equation (2.78) while using $E_t^{Q(g)}[dZ] = (r + q^g\sigma^{Z(T)})Z\,dt$ and $E_t^{Q(g)}[dW_t] = 0$. This is essentially a special case of the Feynman-Kac result. In the special case where $g_t = B_t$, i.e., the money-market account, the price of risk is zero (hence also giving $\mu^{Z(T)} = r$) and we finally find equation (2.74). Since the pricing function $Z(r, t, T)$ is not dependent on the price of risk, the coefficient $\mu^g - q^g\sigma$ of $\partial Z/\partial r$ in equation (2.80) must coincide with the risk-neutral drift, and we conclude that the drift $\mu^g$ of the short rate process satisfies equation (2.75). By applying Itô's lemma to the expectation in equation (2.76), one can show that $Z$ satisfies partial differential equation (2.74) with condition $r = r_t$, thus verifying formula (2.76). Another simple proof of the bond-pricing equation is to apply the Feynman-Kac formula to the conditional expectation in equation (2.76), which can be written as $B_t E_t^{Q(B)}[B_T^{-1}]$, where this last expectation satisfies a Feynman-Kac PDE. Finally, equation (2.77) follows from the Fokker-Planck equation (or Kolmogorov forward equation) for the probability density corresponding to the process in equation (2.72). □


2.3.2


Hull-White, Ho-Lee, and Vasicek Models


There is empirical evidence that the interest rate process in the real-world measure is mean 
reverting. The series for the five-year U.S. dollar rate in Figure 2.7 shows this phenomenon 
visually. The periods with high and low rates alternate, following the expansion and recession 
cycles of the economy. There is also strong evidence from option prices that the risk-neutral 
process is mean reverting as well. Notice that this conclusion is not obvious, mathematically, 
since the price of risk can in principle offset the mean-reverting character of the overall 
process. Nevertheless, market expectations as they are reflected through cap prices, for 
instance, reveal that the market expects rates to fluctuate not far from the historical mean 
on long time scales. A large class of stochastic models with the mean-reversion property 
can be constructed based on two processes: the Ornstein-Uhlenbeck process and the Cox-Ingersoll-Ross process. We construct both models emphasizing both the continuous-time interpretation and the discrete-time recurrence relations they satisfy. This approach has the advantage of clarifying the methodology for statistical estimations using daily or weekly data and for generating Monte Carlo simulations.

FIGURE 2.7 A time series for the 5-year U.S. dollar (USD) rate.

In what follows we describe an explicit method of obtaining expectations of stochastic quantities, as well as the discount function, by the use of a discrete stochastic calculus approach combined with a subsequent continuous-time limit. Let us first consider the time interval $[0, t]$ and its discrete subdivision, with the points $\mathcal{T} = \{t_0 = 0, t_1, \ldots, t_n = t\}$ making up $n$ subintervals of length $\delta t_i = t_{i+1} - t_i$. Subinterval paths are defined by means of the recurrence relations

$$r_{t_{i+1}} = e^{-b(t_i)\delta t_i}\,r_{t_i} + a(t_i)\,\delta t_i + \sigma(t_i)\,\delta W_{t_i}, \tag{2.81}$$

for all $i = 0, \ldots, n-1$, where $\delta W_{t_i}$ are uncorrelated Brownian increments such that

$$E_0[\delta W_{t_i}\,\delta W_{t_j}] = \delta_{ij}\,\delta t_i. \tag{2.82}$$


The solution to these recurrence relations is readily found by iteration, giving

$$r_{t_n} = e^{-\sum_{i=0}^{n-1} b(t_i)\delta t_i}\,r_0 + \sum_{i=0}^{n-1} e^{-\sum_{j=i+1}^{n-1} b(t_j)\delta t_j}\left(a(t_i)\,\delta t_i + \sigma(t_i)\,\delta W_{t_i}\right). \tag{2.83}$$

In the continuous-time limit, as the partition of the interval $[0, t]$ becomes finer and finer, i.e., in the limit $\delta t_i \to 0$ (or $n \to \infty$), this expression for the stochastic process is given by the stochastic integral:

$$r_t = r_0\,e^{-\int_0^t b(u)\,du} + \int_0^t e^{-\int_s^t b(u)\,du}\left(a(s)\,ds + \sigma(s)\,dW_s\right). \tag{2.84}$$

Notice that this expression reduces to equation (2.83) if the functions $a(t)$, $b(t)$, and $\sigma(t)$ are piecewise constant in the intervals $[t_i, t_i + \delta t_i)$. Differentiating this expression with respect to $t$ while using Leibniz's rule for the derivative of the integral on the right gives the stochastic differential equation satisfied by $r_t$ as

$$dr_t = \left(a(t) - b(t)\,r_t\right)dt + \sigma(t)\,dW_t. \tag{2.85}$$

This model encapsulates both the Hull-White and Vasicek models [HW93, Vas77]. The Hull-White model obtains by setting $b(t) = b$, $\sigma(t) = \sigma$ as constants and keeping $a(t)$ as time dependent. The Vasicek model obtains by also setting $a(t) = a$ as constant. The Ho-Lee model corresponds to setting $b(t) = 0$, $\sigma(t) = \sigma$ as constants and $a(t)$ as generally time dependent. The Black-Karasinski model obtains by replacing the short rate $r_t$ with the logarithm $\log r_t$ in equation (2.85).
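
The discrete recurrence (2.81) doubles as a simulation scheme. The Python sketch below iterates it for constant coefficients (the Vasicek case) and compares the sample statistics with the constant-coefficient forms of the mean and variance obtained from equation (2.84). All parameter values, function names, and the random seed are hypothetical choices for illustration.

```python
import math
import random


def simulate_short_rate(r0, a, b, sigma, t, n_steps, rng):
    """Iterate recurrence (2.81) with constant a, b, sigma (Vasicek case)."""
    dt = t / n_steps
    decay = math.exp(-b * dt)
    r = r0
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))    # Brownian increment, variance dt
        r = decay * r + a * dt + sigma * dW
    return r


def exact_mean(r0, a, b, t):
    """Mean of r_t from (2.84) with constant a and b."""
    return r0 * math.exp(-b * t) + (a / b) * (1.0 - math.exp(-b * t))


def exact_variance(b, sigma, t):
    """Variance of r_t from (2.84) with constant b and sigma."""
    return sigma ** 2 * (1.0 - math.exp(-2.0 * b * t)) / (2.0 * b)


rng = random.Random(12345)
params = dict(r0=0.03, a=0.02, b=0.4, sigma=0.01, t=1.0, n_steps=100)
samples = [simulate_short_rate(rng=rng, **params) for _ in range(2000)]
sample_mean = sum(samples) / len(samples)
sample_var = sum((x - sample_mean) ** 2 for x in samples) / (len(samples) - 1)
```

The recurrence carries a small discretization bias in the drift term, of order $b\,\delta t_i$, which disappears in the continuous-time limit.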

From the solution in equation (2.83) one can obtain the expectation, at time $t = 0$, of the random variable $r_t$ by making use of $E_0[\delta W_{t_i}] = 0$ and then taking the continuous-time limit of the sums, giving

$$E_0[r_t] = r_0\,e^{-\int_0^t b(u)\,du} + \int_0^t e^{-\int_s^t b(u)\,du}\,a(s)\,ds. \tag{2.86}$$

The reader will also note that this is consistent with taking expectations on both sides of equation (2.84) and using the property of zero expectation for the stochastic integral part, as discussed in Section 1.4. Similarly, the variance can be obtained by considering the following expectation in the continuous-time limit:

$$\begin{aligned}
E_0\left[(r_t - E_0[r_t])^2\right] &= E_0\left[\left(\sum_{i=0}^{n-1} e^{-\sum_{k=i+1}^{n-1} b(t_k)\delta t_k}\,\sigma(t_i)\,\delta W_{t_i}\right)^2\right] \\
&= \sum_{i=0}^{n-1}\sum_{j=0}^{n-1} e^{-\sum_{k=i+1}^{n-1} b(t_k)\delta t_k}\,e^{-\sum_{k=j+1}^{n-1} b(t_k)\delta t_k}\,\sigma(t_i)\,\sigma(t_j)\,E_0[\delta W_{t_i}\,\delta W_{t_j}] \\
&= \sum_{i=0}^{n-1} e^{-2\sum_{k=i+1}^{n-1} b(t_k)\delta t_k}\,\sigma(t_i)^2\,\delta t_i \;\to\; \int_0^t e^{-2\int_s^t b(u)\,du}\,\sigma(s)^2\,ds,
\end{aligned} \tag{2.87}$$

where the last expression is obtained in the limit $n \to \infty$. The reader will note that equation (2.87) follows also by equations (2.84) and (2.86) after applying Property (1.106).

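In this model for the short rate process we have the useful result that the variable defined by the integral $X_t^T = \int_t^T r_s\,ds$, for any time interval $[t, T]$, is a normal random variable. Hence, the discount function can be obtained in terms of the mean and standard deviation of the random variable $X_t^T$, as is shown next. To compute the mean and standard deviation of $X_t^T$, consider now the interval $[t, T]$ with $n$ subdivisions with time points $t_0 = t, t_1, \ldots, t_n = T$ and, as before, $\delta t_i = t_{i+1} - t_i$. The discretized form of the integral is

$$X_t^T = \sum_{k=0}^{n-1} r_{t_k}\,\delta t_k. \tag{2.88}$$

Taking the expectation, at time $t$, of this sum while using equation (2.83) for $r_{t_k}$ and $E_t[\delta W_{t_i}] = 0$ gives

$$E_t[X_t^T] = r_t\sum_{k=0}^{n-1} e^{-\sum_{j=0}^{k-1} b(t_j)\delta t_j}\,\delta t_k + \sum_{k=0}^{n-1}\sum_{i=0}^{k-1} e^{-\sum_{j=i+1}^{k-1} b(t_j)\delta t_j}\,a(t_i)\,\delta t_i\,\delta t_k. \tag{2.89}$$

In the continuous-time limit we have the mean

$$\bar X_t^T = E_t[X_t^T] = r_t\int_t^T e^{-\int_t^s b(u)\,du}\,ds + \int_t^T\!\!\int_t^s a(u)\,e^{-\int_u^s b(v)\,dv}\,du\,ds = r_t\,n(t, T) + m_0(t, T), \tag{2.90}$$

where the functions $n(t, T)$, $m_0(t, T)$ have been defined through the integrals. The reader can also verify that this result obtains by applying Property (1.105) together with (2.84), after a time shift. The variance follows from the expectation:

$$\begin{aligned}
\operatorname{var}[X_t^T] &= E_t\left[(X_t^T - E_t[X_t^T])^2\right] = E_t\left[\left(\sum_{k=0}^{n-1}\sum_{i=0}^{k-1} e^{-\sum_{j=i+1}^{k-1} b(t_j)\delta t_j}\,\sigma(t_i)\,\delta W_{t_i}\,\delta t_k\right)^2\right] \\
&= \sum_{k=0}^{n-1}\sum_{i=0}^{k-1}\sum_{k'=0}^{n-1}\sum_{i'=0}^{k'-1} e^{-\sum_{j=i+1}^{k-1} b(t_j)\delta t_j - \sum_{j=i'+1}^{k'-1} b(t_j)\delta t_j}\,\sigma(t_i)\,\sigma(t_{i'})\,\delta t_k\,\delta t_{k'}\,E_t[\delta W_{t_i}\,\delta W_{t_{i'}}] \\
&= \left(\sum_{k=0}^{n-1}\sum_{k'=0}^{k}\sum_{i=0}^{k'-1} + \sum_{k'=0}^{n-1}\sum_{k=0}^{k'-1}\sum_{i=0}^{k-1}\right) e^{-\sum_{j=i+1}^{k-1} b(t_j)\delta t_j - \sum_{j=i+1}^{k'-1} b(t_j)\delta t_j}\,\sigma(t_i)^2\,\delta t_i\,\delta t_k\,\delta t_{k'} \\
&\to \left(\int_t^T ds\int_t^s ds'\int_t^{s'} du + \int_t^T ds'\int_t^{s'} ds\int_t^{s} du\right)\sigma(u)^2\,e^{-\int_u^s b(v)\,dv - \int_u^{s'} b(v)\,dv},
\end{aligned} \tag{2.91}$$

where the last expression obtains in the limit $n \to \infty$. By reversing the order of integration in these integrals one can write the expression as one integral term, giving:

$$\operatorname{var}[X_t^T] = \int_t^T \sigma(\tau)^2\left(\int_\tau^T e^{-\int_\tau^s b(u)\,du}\,ds\right)^2 d\tau = \int_t^T \sigma(\tau)^2\,n(\tau, T)^2\,d\tau = m_1(t, T). \tag{2.92}$$

Having obtained $\bar X_t^T$ and $\operatorname{var}[X_t^T]$, we therefore have the probability density for the normal random variable $X_t^T \sim N(\bar X_t^T, \operatorname{var}[X_t^T])$ taking on a value $y$, as viewed at time $t$, given by a Gaussian:

$$p(y) = \frac{1}{\sqrt{2\pi\operatorname{var}[X_t^T]}}\exp\left(-\frac{(y - \bar X_t^T)^2}{2\operatorname{var}[X_t^T]}\right). \tag{2.93}$$

The discount function $Z_t(T) = Z(r_t, t, T)$ is finally obtained in terms of the expectation

$$Z_t(T) = E_t\left[e^{-X_t^T}\right] = \int_{-\infty}^{\infty} p(y)\,e^{-y}\,dy = e^{m(t, T) - n(t, T)\,r_t}, \tag{2.94}$$

where

$$m(t, T) = \frac{1}{2}m_1(t, T) - m_0(t, T). \tag{2.95}$$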
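
The last equality in the discount-function formula is the standard Gaussian integral, i.e., the moment-generating function of a normal variable evaluated at $-1$. As a sketch, writing $\bar X$ for the mean in (2.90) and $v$ for the variance in (2.92), completing the square in the exponent gives:

```latex
\begin{aligned}
E_t\!\left[e^{-X_t^T}\right]
  &= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi v}}
     \exp\!\left(-\frac{(y-\bar X)^2}{2v} - y\right) dy \\
  &= e^{-\bar X + v/2} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi v}}
     \exp\!\left(-\frac{(y-\bar X+v)^2}{2v}\right) dy
   = e^{-\bar X + v/2} \\
  &= \exp\!\left(\tfrac{1}{2}m_1(t,T) - m_0(t,T) - n(t,T)\,r_t\right)
   = e^{m(t,T) - n(t,T)\,r_t}.
\end{aligned}
```

The remaining integral equals one because its integrand is a normal density with mean $\bar X - v$ and variance $v$.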
Note that this discount function can also be derived by using the method discussed in the next section. There, the solution for $Z(r_t, t, T)$ in the form of an exponential of an affine function in $r = r_t$ [see equation (2.94) or (2.116)] is obtained by simply plugging the expression into the bond-pricing equation, where the volatility function is independent of the short rate. The functions $m(t, T)$ and $n(t, T)$ are readily shown to satisfy a system of first-order equations,

$$\frac{\partial n}{\partial t} = bn - 1, \tag{2.96}$$

and

$$\frac{\partial m}{\partial t} - an + \frac{1}{2}\sigma(t)^2 n^2 = 0, \tag{2.97}$$

with final time conditions $m(T, T) = n(T, T) = 0$. For these models, this system is exactly integrable, giving the same integral expressions as before.

For purposes of yield curve fitting, it is of interest to consider the formulas for the discount function in terms of the zero-coupon yields. In particular, the foregoing solution reads

$$y_t(T) = (T - t)^{-1}\left(n(t, T)\,r_t - m(t, T)\right). \tag{2.98}$$

The interpretation of this equation is that, for one-factor models having discount functions as exponentials of affine functions of the short rate, the shocks due to changes in the short rate are the only ones to affect the shape of the yield curve, which moves parallel to itself, according to equation (2.98).

The function $n(t, T)$ is linked to the term structure of volatility at calendar time $t$. In fact, by taking the stochastic differential of $y_t(T)$ in equation (2.98) while using equation (2.85), the yield is shown to have volatility

$$\sigma_t^{y} = \frac{n(t, T)\,\sigma(t)}{T - t}. \tag{2.99}$$

The variance of the differential of the yield hence has a quadratic form given by

$$\operatorname{var}(dy_t(T)) = (\sigma_t^{y})^2\,dt = \frac{n(t, T)^2\,\sigma(t)^2}{(T - t)^2}\,dt. \tag{2.100}$$

The foregoing yield volatility equation allows one to fit the function $b(t)$ in terms of the current term structure of volatility. Indeed, since

$$n(t, T) = \int_t^T e^{-\int_t^s b(u)\,du}\,ds, \tag{2.101}$$

by differentiating with respect to $T$ we have

$$b(T) = -\frac{\partial}{\partial T}\log\left(\frac{\partial n(t, T)}{\partial T}\right). \tag{2.102}$$

Note that we can rewrite this equation by changing variable names, letting maturity $T \to t$ and present time $t \to t_0$, giving

$$b(t) = -\frac{\partial}{\partial t}\log\left(\frac{\partial n(t_0, t)}{\partial t}\right). \tag{2.103}$$

Given the fitted function $b(t)$, one can then fit (or retrieve) the function $a(t)$ from the discount function or using equation (2.98). Moreover, for the case of the Vasicek, Hull-White, and Ho-Lee models, all of the preceding integral expressions are readily worked out exactly in terms of exponential functions. Let us specifically work out the formulas for the case of the Hull-White model. Since $b$, $\sigma$ are constants, equation (2.101) is integrated to give

$$n(t, T) = \frac{1}{b}\left(1 - e^{-b(T-t)}\right). \tag{2.104}$$

And for $m(t, T)$ we have

$$m(t, T) = \frac{1}{b}\int_t^T \left[e^{-b(T-\tau)} - 1\right]a(\tau)\,d\tau + \frac{\sigma^2}{2b^2}\int_t^T \left(1 - e^{-b(T-\tau)}\right)^2 d\tau. \tag{2.105}$$

Taking logarithms of equation (2.94) gives

$$\log Z_t(T) = m(t, T) - n(t, T)\,r_t. \tag{2.106}$$

Differentiating this equation with respect to $T$ while using equations (2.105) and (2.104) gives

$$\frac{\partial}{\partial T}\log Z_t(T) = \frac{\sigma^2}{2b^2}\left[1 - 2e^{-b(T-t)} + e^{-2b(T-t)}\right] - r_t\,e^{-b(T-t)} - \int_t^T a(\tau)\,e^{-b(T-\tau)}\,d\tau. \tag{2.107}$$

Differentiating again while using equation (2.107) then gives

$$a(T) = -\frac{\partial^2}{\partial T^2}\log Z_t(T) - b\frac{\partial}{\partial T}\log Z_t(T) + \frac{\sigma^2}{2b}\left(1 - e^{-2b(T-t)}\right). \tag{2.108}$$

Changing the variable name $T$ to $t$ and taking the initial time as zero gives

$$a(t) = -\frac{\partial^2}{\partial t^2}\log Z_0(t) - b\frac{\partial}{\partial t}\log Z_0(t) + \frac{\sigma^2}{2b}\left(1 - e^{-2bt}\right). \tag{2.109}$$

This last equation gives us a useful relationship between the drift function and the zero-coupon bond prices, as a function of the maturity. In particular, one can rewrite this in terms of the instantaneous continuously compounded forward rates [these are defined in a later section; see equation (2.153)]:

$$a(t) = \frac{\partial}{\partial t}f_0(t) + b\,f_0(t) + \frac{\sigma^2}{2b}\left(1 - e^{-2bt}\right). \tag{2.110}$$
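
The Hull-White expressions (2.104)-(2.106) and the yield formula (2.98) translate directly into code. The following Python sketch evaluates them for an arbitrary drift function $a(t)$. The midpoint quadrature, the function names, and the sample drift and parameters are assumptions made for illustration only.

```python
import math


def n_func(t, T, b):
    """n(t, T) for constant b, equation (2.104)."""
    return (1.0 - math.exp(-b * (T - t))) / b


def m_func(t, T, a, b, sigma, steps=4000):
    """m(t, T) from equation (2.105); `a` is a callable tau -> a(tau).

    Both integrals are approximated together with the midpoint rule
    (an illustrative choice).
    """
    h = (T - t) / steps
    total = 0.0
    for i in range(steps):
        tau = t + (i + 0.5) * h
        decay = math.exp(-b * (T - tau))
        total += ((decay - 1.0) * a(tau) / b
                  + sigma ** 2 * (1.0 - decay) ** 2 / (2.0 * b ** 2)) * h
    return total


def discount(r, t, T, a, b, sigma):
    """Z_t(T) = exp(m - n r), equation (2.94)."""
    return math.exp(m_func(t, T, a, b, sigma) - n_func(t, T, b) * r)


def zero_yield(r, t, T, a, b, sigma):
    """Zero-coupon yield y_t(T), equation (2.98)."""
    return (n_func(t, T, b) * r - m_func(t, T, a, b, sigma)) / (T - t)


# Hypothetical inputs: a drift that increases linearly in time.
drift = lambda tau: 0.02 + 0.005 * tau
z_5y = discount(r=0.03, t=0.0, T=5.0, a=drift, b=0.4, sigma=0.01)
y_5y = zero_yield(r=0.03, t=0.0, T=5.0, a=drift, b=0.4, sigma=0.01)
```

Setting the drift to a constant recovers the Vasicek model, for which both integrals in (2.105) can be carried out by hand.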

Lastly, notice that the option-pricing formulas in the previous section, obtained under the forward measure, can be applied, since the lognormal volatility of a zero-coupon bond forward price $F_t(T, T')$ is a deterministic function of time given by

$$\left(\frac{dF_t(T, T')}{F_t(T, T')}\right)^2 = \sigma^2\left(n(t, T') - n(t, T)\right)^2 dt. \tag{2.111}$$

The pricing formulas in the previous section, however, require the bond-forward measure. Formulas such as those for swaptions and options on coupon bonds are not applicable here because the resulting volatility is not a deterministic function of time.

The foregoing short-rate models are among the popular models used for pricing interest rate options. In particular, lattice methods are useful for calibration and pricing. For an actual implementation of binomial and trinomial lattice trees within the Ho-Lee, Black-Derman-Toy, Hull-White, and Black-Karasinski models, the reader is referred to the project on interest rate trees in Part II of this book. The project contains an elaborate discussion of the various implementation steps for calibrating binomial and trinomial short-rate lattices, and for numerically pricing interest rate derivatives within these four models.


Cox—Ingersoll—Ross Model 


The stationary Cox-Ingersoll-Ross (CIR) model for the short-rate process is generally defined as follows under the risk-neutral measure:

$$dr_t = (a - b\,r_t)\,dt + \sigma\sqrt{r_t}\,dW_t. \tag{2.112}$$

According to the foregoing theorem, the bond-pricing PDE for this process is:

$$\frac{\partial Z}{\partial t} + (a - br)\frac{\partial Z}{\partial r} + \frac{\sigma^2 r}{2}\frac{\partial^2 Z}{\partial r^2} = rZ. \tag{2.113}$$

The stochastic differential equation satisfied by $Z = Z(r, t, T)$, where $r = r_t$, is

$$dZ = rZ\,dt + \sigma_Z Z\,dW_t, \tag{2.114}$$

where

$$\sigma_Z = \frac{\sigma\sqrt{r}}{Z}\frac{\partial Z}{\partial r}. \tag{2.115}$$

Note that the CIR model is sometimes written so that the risk-neutral drift term has the form $\kappa(\theta - r)$, where the constants $\kappa$ and $\theta$ correspond to the rate of reversion and mean level, respectively. In our convention, this simply corresponds to setting $\theta = a/b$ and $\kappa = b$.

As with the Vasicek model in the previous section, the discount function for the CIR model takes the form of an exponential of an affine function in $r$:

$$Z(r, t, T) = \exp\left(m(t, T) - n(t, T)\,r\right). \tag{2.116}$$

Direct substitution leads to the equations

$$\frac{\partial n}{\partial t} - \frac{1}{2}\sigma^2 n^2 - bn + 1 = 0 \tag{2.117}$$

and

$$\frac{\partial m}{\partial t} = an. \tag{2.118}$$

The final-time condition $Z(r, T, T) = 1$ gives $m(T, T) = n(T, T) = 0$. Note the difference between these equations and equations (2.96) and (2.97), obtained for the models considered in the previous section. Again, exact expressions for $m(t, T)$ and $n(t, T)$ are readily obtained by integrating equation (2.117) and subsequently equation (2.118), giving

$$m(t, T) = \frac{2a}{\sigma^2}\log\left(\frac{\gamma\,e^{b\tau/2}}{\gamma\cosh\gamma\tau + \frac{1}{2}b\sinh\gamma\tau}\right) \tag{2.119}$$

and

$$n(t, T) = \frac{\sinh\gamma\tau}{\gamma\cosh\gamma\tau + \frac{1}{2}b\sinh\gamma\tau}, \tag{2.120}$$

where $\tau = T - t$ is the time to maturity and $\gamma = \frac{1}{2}\sqrt{b^2 + 2\sigma^2}$.
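
Equations (2.116), (2.119), and (2.120) give the CIR discount function in closed form. A minimal Python sketch follows; the function names are illustrative, and the sample parameters are the ones quoted later in this section for Figure 2.8.

```python
import math


def cir_m_n(tau, a, b, sigma):
    """Return (m, n) from equations (2.119)-(2.120) for time to maturity tau."""
    gamma = 0.5 * math.sqrt(b * b + 2.0 * sigma * sigma)
    denom = gamma * math.cosh(gamma * tau) + 0.5 * b * math.sinh(gamma * tau)
    n = math.sinh(gamma * tau) / denom
    m = (2.0 * a / sigma ** 2) * math.log(gamma * math.exp(0.5 * b * tau) / denom)
    return m, n


def cir_discount(r, tau, a, b, sigma):
    """Discount function Z = exp(m - n r), equation (2.116)."""
    m, n = cir_m_n(tau, a, b, sigma)
    return math.exp(m - n * r)


a, b, sigma, r0 = 0.075, 0.35, 0.15, 0.065
z_5y = cir_discount(r0, 5.0, a, b, sigma)
yield_5y = -math.log(z_5y) / 5.0      # continuously compounded zero yield
```

Because $m$ and $n$ depend on $t$ and $T$ only through $\tau = T - t$, the whole discount curve at a given short rate is obtained by sweeping `tau`.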
The Fokker-Planck equation for the risk-neutral probability density of the spot rate is

$$\frac{\partial p(r, t)}{\partial t} = \frac{1}{2}\sigma^2\frac{\partial^2}{\partial r^2}\left(r\,p(r, t)\right) - \frac{\partial}{\partial r}\left((a - br)\,p(r, t)\right). \tag{2.121}$$

In the long time limit $t \to \infty$ the distribution approaches a steady state with $\partial p/\partial t \to 0$. As one can verify by direct substitution into the right-hand side of equation (2.121), the stationary probability distribution, denoted by $p_\infty(r)$, is

$$p_\infty(r) = \frac{(2b/\sigma^2)^{2a/\sigma^2}}{\Gamma(2a/\sigma^2)}\,r^{(2a/\sigma^2) - 1}\,e^{-(2b/\sigma^2)r}, \qquad b > 0, \tag{2.122}$$

where $\Gamma(\cdot)$ is the gamma function. Notice that when $a > \sigma^2/2$, $p_\infty(r) \to 0$ as $r \to 0$, i.e., gives zero probability of attaining zero interest rates. Otherwise, the stationary probability distribution diverges in the limit $r \to 0$ when $a < \sigma^2/2$.

In particular, the distribution integrates to unity for $a > 0$, has an integrable singularity at $r = 0$ for values $0 < a < \sigma^2/2$, and is nonintegrable for $a < 0$. For $a \in (0, \sigma^2/2]$ the origin is reflective. These same conclusions also apply to the time-dependent density just below.
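
The stationary density (2.122) is a gamma density with shape $2a/\sigma^2$ and rate $2b/\sigma^2$, so its normalization and its mean $a/b$ can be checked by direct quadrature. The Python sketch below does this; the function names, the truncation of the integration range at $r = 2$, and the sample parameters are illustrative assumptions.

```python
import math


def stationary_density(r, a, b, sigma):
    """Stationary CIR density p_inf(r), equation (2.122), for r > 0."""
    shape = 2.0 * a / sigma ** 2
    rate = 2.0 * b / sigma ** 2
    log_p = (shape * math.log(rate) - math.lgamma(shape)
             + (shape - 1.0) * math.log(r) - rate * r)
    return math.exp(log_p)


def integrate(f, lo, hi, steps=20000):
    """Midpoint rule; it never evaluates f at the endpoint r = 0."""
    h = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * h) for i in range(steps)) * h


a, b, sigma = 0.075, 0.35, 0.15      # here a > sigma^2 / 2, so p_inf(0) = 0
total = integrate(lambda r: stationary_density(r, a, b, sigma), 0.0, 2.0)
mean = integrate(lambda r: r * stationary_density(r, a, b, sigma), 0.0, 2.0)
```

For these inputs the mass beyond $r = 2$ is negligible, so `total` should be close to one and `mean` close to the mean level $a/b$.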

An exact analytical solution of the time-dependent Fokker-Planck equation (2.121) for the distribution function $p(r, t) = p(r, r_0; t)$, subject to the initial-time condition $p(r, t = 0) = \delta(r - r_0)$, can be shown to take the form ($a, b > 0$)

$$p(r, r_0; t) = c_t\left(\frac{r\,e^{bt}}{r_0}\right)^{q/2}\exp\left(-c_t\left(r_0\,e^{-bt} + r\right)\right)I_q\left(2c_t\left(r_0\,r\,e^{-bt}\right)^{1/2}\right), \tag{2.123}$$

where $c_t = 2b/(\sigma^2(1 - e^{-bt}))$, $q = (2a/\sigma^2) - 1$, and $I_q(\cdot)$ is the modified Bessel function of the first kind of order $q$. Useful properties of the Bessel functions are contained in Appendix C of Chapter 3. Further properties of this density are given as problems at the end of this section. By using the series expansion of the modified Bessel function, the distribution function in equation (2.123) can be shown to be related to the noncentral chi-squared function $f(x; \nu, \lambda)$, since

$$f(x; \nu, \lambda) = \frac{1}{2}\,e^{-(\lambda + x)/2}\left(\frac{x}{\lambda}\right)^{(\nu - 2)/4}I_{(\nu - 2)/2}\left(\sqrt{\lambda x}\right), \tag{2.124}$$

where $\nu$ and $\lambda$ are the number of degrees of freedom and the noncentrality parameter, respectively. In particular, for the CIR model under the risk-neutral measure, the spot rate $r = r_t$ at time $t$ is a random variable generated by

$$r_t = \frac{\sigma^2\left(1 - e^{-bt}\right)}{4b}\,\rho, \tag{2.125}$$

where $\rho$ is a noncentral chi-squared random variable with $2(q + 1) = 4a/\sigma^2$ degrees of freedom and time-dependent noncentrality parameter equal to $2c_t r_0 e^{-bt}$. Figure 2.8 gives a plot of the foregoing risk-neutral density for different time values $t = 0.25$, $0.5$, and $1.5$ and with choice of parameters $a = 0.075$, $b = 0.35$, $\sigma = 0.15$, $r_0 = 0.065$ (all units are on a yearly basis). With this choice of parameters, the steady-state distribution is nearly attained at values of $t \approx 20$.

FIGURE 2.8 Plots of the CIR risk-neutral transition probability density as a function of the short rate, at three different chosen times.
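
Equation (2.125) gives a recipe for drawing the CIR short rate exactly, with no time discretization. The Python sketch below is one way to do it with the standard library only: the noncentral chi-squared variable is generated as a Poisson mixture of central chi-squared variables, which is a standard representation but is not the method of the text. The helper names, the seed, and the simple Poisson sampler (adequate only for moderate noncentrality) are assumptions.

```python
import math
import random


def sample_poisson(mean, rng):
    """Knuth's multiplication method; suitable for moderate means only."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def sample_noncentral_chi2(dof, noncentrality, rng):
    """Poisson mixture: chi-squared with dof + 2N degrees, N ~ Poisson(lambda/2)."""
    extra = 2 * sample_poisson(0.5 * noncentrality, rng)
    return rng.gammavariate(0.5 * (dof + extra), 2.0)


def sample_cir_rate(r0, a, b, sigma, t, rng):
    """Draw r_t according to equation (2.125)."""
    c_t = 2.0 * b / (sigma ** 2 * (1.0 - math.exp(-b * t)))
    dof = 4.0 * a / sigma ** 2
    noncentrality = 2.0 * c_t * r0 * math.exp(-b * t)
    rho = sample_noncentral_chi2(dof, noncentrality, rng)
    return rho / (2.0 * c_t)      # equals sigma^2 (1 - e^{-bt}) / (4b) * rho


rng = random.Random(2024)
a, b, sigma, r0, t = 0.075, 0.35, 0.15, 0.065, 1.0   # Figure 2.8 parameters
draws = [sample_cir_rate(r0, a, b, sigma, t, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
expected_mean = r0 * math.exp(-b * t) + (a / b) * (1.0 - math.exp(-b * t))
```

The comparison value `expected_mean` uses the fact that a noncentral chi-squared variable has mean equal to its degrees of freedom plus its noncentrality parameter. For very small $t$ the noncentrality becomes large and a more robust Poisson sampler would be needed.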

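Under the forward measure with numeraire $Z_t(T)$, the equation for $Z = Z_t(T) = 
Z(r_t, t, T)$ is 

$$dZ = (r_t + \sigma_Z^2) Z\,dt + \sigma_Z Z\,dW_t^T, \tag{2.126}$$

where $dW_t^T$ is the Brownian increment under that measure. Assuming that under the forward 
measure the short rate evolves as 

$$dr_t = \mu(r_t)\,dt + \sigma\sqrt{r_t}\,dW_t^T, \tag{2.127}$$

this implies, due to Itô's lemma, and from equation (2.113), 

$$\begin{aligned}
dZ &= \left(\frac{\partial Z}{\partial t} + \mu(r)\frac{\partial Z}{\partial r} + \frac{1}{2}\sigma^2 r \frac{\partial^2 Z}{\partial r^2}\right)dt + \sigma\sqrt{r}\,\frac{\partial Z}{\partial r}\,dW_t^T \\
&= \left(rZ + \left(\mu(r) - a + br\right)\frac{\partial Z}{\partial r}\right)dt + \sigma\sqrt{r}\,\frac{\partial Z}{\partial r}\,dW_t^T,
\end{aligned} \tag{2.128}$$

where $r_t = r$. Hence, the drift obtains as 

$$\mu(r) - a + br = \sigma_Z^2 Z \left(\frac{\partial Z}{\partial r}\right)^{-1} = \frac{\sigma^2 r}{Z}\frac{\partial Z}{\partial r}, \tag{2.129}$$

giving 

$$dr_t = \left(a - br_t + \frac{\sigma^2 r_t}{Z}\frac{\partial Z}{\partial r_t}\right)dt + \sigma\sqrt{r_t}\,dW_t^T. \tag{2.130}$$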
Under this forward measure, one can also solve the Fokker–Planck equation for the process 
defined by the corresponding stochastic differential equation, giving a slightly more 
algebraically involved analytical expression for the density, yet again in terms of the modified 
Bessel function. This follows from the fact that $Z = Z(r, t, T) = e^{m(t,T) - n(t,T) r}$, so $(\partial Z/\partial r)/Z = 
-n(t, T)$ (independent of r). Hence the foregoing SDE has the same structure as the original 
SDE for the CIR process in the risk-neutral measure, except for an additional time dependence 
introduced into the mean-reversion coefficient. The solution follows by applying appropriate 
transformations.¹ In particular, it can be shown that a random variable for the short rate $r = r_\tau$ 
at any intermediate time $\tau$ with $0 \le t < \tau \le T$ has the form 


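$$r_\tau = -\frac{\sigma^2 \left(n(t, T) - n(\tau, T)\right)}{4\,\partial n(\tau, T)/\partial \tau}\,\rho, \tag{2.131}$$

where $\rho$ is a noncentral chi-squared random variable with $4a/\sigma^2$ degrees of freedom and 
noncentrality parameter given by 

$$-\frac{4\,\partial n(t, T)/\partial t}{\sigma^2 \left(n(t, T) - n(\tau, T)\right)}\,r_t. \tag{2.132}$$

Note that a simplification arises with the choice of time parameters t = 0 and $\tau = T$. We 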
refer to the literature on the CIR model [CIR85] for a derivation of these results. The 
more advanced material in Chapter 3 that deals with Green’s function methods for the 
Fokker—Planck equation actually provides the reader with the mathematical tools for deriving 
analytically exact transition probability densities for the short-rate process within the CIR 
and other models from first principles. Such transition densities, or formulas of the type just 
given, allow one to price most European-style interest rate derivatives and to generate exact 
scenarios for the short rate under the CIR model. 


Problems 


Problem 1. Show that the transition probability density function $p(r, r_0; t)$ in equation (2.123) 
satisfies the Fokker–Planck equation (2.121), with initial condition $p(r, r_0; t = 0) = \delta(r - r_0)$. 
Hint: After inserting the solution into the Fokker–Planck equation, differentiating and 
collecting terms, arrive at a second-order ordinary differential equation for the modified Bessel 
functions; i.e., show that this gives the modified Bessel equation of the form (see Appendix C 
in Chapter 3) 

$$\frac{d^2 I_\nu(x)}{dx^2} + \frac{1}{x}\frac{dI_\nu(x)}{dx} - \left(1 + \frac{\nu^2}{x^2}\right) I_\nu(x) = 0,$$

where $\nu$ is the order. 


Problem 2. Verify that the CIR density in equation (2.123) where a, b > 0 gives 

$$\int_0^\infty p(r, r_0; t)\,dr = 1, \tag{2.133}$$

hence demonstrating that short rates are never negative, i.e., that any short-rate path starting 
at time t = 0 at any finite positive value $r_0$ will end up in the positive axis with probability 1 
at any finite later time t > 0. Hint: Use the Bessel integral property (3.357) in Appendix C 
of Chapter 3. 

¹ Let $P(r, r_0; t)$ be the transition density for the process $dr_t = (a - b(t) r_t)\,dt + \sigma\sqrt{r_t}\,dW_t$, with deterministic 
time-dependent coefficient b(t), and define the respective scale and time changes: $\Lambda(t) = e^{\int_0^t b(u)\,du}$ and $\tau(t) = \int_0^t \Lambda(u)\,du$. 
Then $P(r, r_0; t) = \Lambda(t)\,u(\Lambda(t) r, r_0; \tau(t))$, where u is the density for the Bessel process as given in equation (3.215) 
with Bessel order $\mu = (2a/\sigma^2) - 1$. Note: when b(t) = b is constant this corresponds to the density in equation 
(2.123). 


Problem 3. Show that the CIR density in equation (2.123) satisfies the Chapman–Kolmogorov 
equation 

$$\int_0^\infty p(r_T, r; T - t)\,p(r, r_0; t)\,dr = p(r_T, r_0; T). \tag{2.134}$$


Hint: Use an appropriate Bessel integral property from Appendix C of Chapter 3. 


Problem 4. The integrated form of equation (2.112) from time s to time t gives 

$$r_t = r_s + \int_s^t (a - b r_\tau)\,d\tau + \int_s^t \sigma\sqrt{r_\tau}\,dW_\tau. \tag{2.135}$$

(a) Show that $E_s[r_t] = E[r_t \mid r_{\tau=s} = r_s]$ satisfies a first-order ODE in time t > s, with initial 
condition $E_s[r_t] = r_s$ at t = s. Solve the initial-value problem and thereby obtain an 
exact expression for the conditional mean $E_s[r_t]$. 

(b) Obtain an exact expression for the conditional variance $\mathrm{Var}(r_t \mid r_{\tau=s} = r_s) = E_s[(r_t)^2] - 
(E_s[r_t])^2$. 


Problem 5. Assume the short rate satisfies SDE (2.85). 

(a) Find an expression for the auto-correlation function 

$$\mathrm{Corr}(r_s, r_t) = \mathrm{Cov}(r_s, r_t)/\sqrt{\mathrm{Var}(r_s)\mathrm{Var}(r_t)}$$

for s < t. 

(b) Find an exact closed-form expression for $\mathrm{Corr}(r_s, r_t)$ by considering b(t) = b, $\sigma(t) = \sigma$ 
as constants. Explain your answer in terms of the mean-reversion parameter and what 
it represents in the limit $b \to 0$. 


Problem 6. Consider the European call option on a zero-coupon bond as stated in Problem 7 
at the end of Section 2.2. Find a closed-form analytical expression for this option price 
$ZBC_t(T, S, K)$ in: 

(a) The Hull-White model with constant mean-reversion coefficient b and constant 
volatility $\sigma$ 
(b) The CIR model described in this last section 

Hint: For part (b) choose $g_t = Z_t(T)$ as numeraire (i.e., use the T-forward measure for taking 
expectations) and use the formulas at the end of this section. In particular, use the appropriate 
transition density for the short rate (within the T-forward measure), and obtain your final 
result as a sum of two terms involving the cumulative chi-squared density. 


2.3.4 Flesaker-Hughston Model 


The Flesaker-Hughston (FH) model is based on the original idea of defining a numeraire 
asset process without a direct financial meaning. Interest in this model stems from the fact 
that it is possible to derive analytical closed-form solutions for both caps and swaptions. 
The numeraire process in FH models is defined as follows: 

$$g_t = \frac{1}{f(t) + g(t) x_t}, \tag{2.136}$$

where f(t) and g(t) are deterministic and strictly decreasing positive functions of calendar 
time t, and $x_t$ is a positive definite martingale. A zero-drift geometric Brownian motion gives 
a possible definition of $x_t$, i.e., 

$$dx_t = \sigma(t) x_t\,dW_t, \tag{2.137}$$


with some chosen initial condition $x_0 = 1$. Notice that in this model, $\log x_t$ follows a simple 
Wiener process with drift $-(\sigma(t))^2/2$ and diffusion $\sigma(t)$. An alternative definition of $x_t$ is the 
variance-gamma process. Within the FH model one readily arrives at an arbitrage-free price 
at time t of a zero-coupon bond of unit worth at maturity time T as 

$$Z_t(T) = g_t E_t^{Q(g)}\left[\frac{1}{g_T}\right] = \frac{f(T) + g(T) x_t}{f(t) + g(t) x_t}. \tag{2.138}$$

Here we have used the martingale condition $E_t[x_T] = x_t$. The instantaneous short rate also 
has a simple expression since $f_t(t) = r_t$ as discussed in Section 2.4; hence, 

$$r_t = -\left.\frac{\partial \log Z_t(T)}{\partial T}\right|_{T=t} = -\frac{f'(t) + g'(t) x_t}{f(t) + g(t) x_t}. \tag{2.139}$$

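Simply compounded (time-t) forward LIBOR rates $L_t(T)$ with settlement date T and tenor 
(compounding period) $\tau$ solve the equation 

$$1 + \tau L_t(T) = \frac{Z_t(T)}{Z_t(T + \tau)} = \frac{f(T) + g(T) x_t}{f(T + \tau) + g(T + \tau) x_t} \tag{2.140}$$

and are thus given by 

$$L_t(T) = \frac{1}{\tau}\left[\frac{f(T) + g(T) x_t}{f(T + \tau) + g(T + \tau) x_t} - 1\right].$$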
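
The bond price and forward LIBOR formulas of the FH model are simple enough to evaluate directly. The short Python sketch below is an illustration only: the function names and the particular decreasing functions f and g are hypothetical choices of ours, not taken from the text. It evaluates $Z_t(T)$ and $L_t(T)$ for a given value of the martingale $x_t$.

```python
import math


def fh_bond_price(f, g, x_t, t, T):
    """Zero-coupon bond price Z_t(T) in the FH model, equation (2.138)."""
    return (f(T) + g(T) * x_t) / (f(t) + g(t) * x_t)


def fh_forward_libor(f, g, x_t, T, tau):
    """Simply compounded forward LIBOR rate L_t(T) for the period [T, T + tau]."""
    ratio = (f(T) + g(T) * x_t) / (f(T + tau) + g(T + tau) * x_t)
    return (ratio - 1.0) / tau


# Hypothetical strictly decreasing positive functions (an assumption for illustration).
def f(t):
    return math.exp(-0.03 * t)


def g(t):
    return 0.5 * math.exp(-0.06 * t)


if __name__ == "__main__":
    x_t, t, T, tau = 1.0, 0.0, 1.0, 0.25
    print("Z_t(T) =", fh_bond_price(f, g, x_t, t, T))
    print("L_t(T) =", fh_forward_libor(f, g, x_t, T, tau))
```

Because f and g are decreasing and $x_t$ is positive, the bond price lies below one for T > t and the forward rate is positive, whatever the level of $x_t$.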


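Using $g_t$ as numeraire and following the pricing methodology as in the worked-out examples 
of Section 2.2, a caplet struck at rate $\kappa$ and maturity T is hence priced as follows: 

$$\begin{aligned}
Cpl_t(\kappa, T) &= g_t E_t^{Q(g)}\left[\frac{\tau Z_T(T + \tau)}{g_T}\left(L_T(T) - \kappa\right)_+\right] \\
&= g_t E_t^{Q(g)}\left[\left(a_0(\kappa, T) + b_0(\kappa, T) x_T\right)_+\right],
\end{aligned}$$

where 

$$a_0(\kappa, T) = f(T) - (1 + \kappa\tau) f(T + \tau), \quad b_0(\kappa, T) = g(T) - (1 + \kappa\tau) g(T + \tau). \tag{2.142}$$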
By using the lognormal probability density function for $x_T$, which results from the process in 
equation (2.137), this expectation integral gives rise to an exact pricing formula: 

$$Cpl_t(\kappa, T) = g_t\left[a_0(\kappa, T)\,N(h_-^{(0)}(t, T, \kappa)) + b_0(\kappa, T)\,x_t\,N(h_+^{(0)}(t, T, \kappa))\right], \tag{2.143}$$

where 

$$h_\pm^{(0)}(t, T, \kappa) = \frac{\log\left(-\dfrac{b_0(\kappa, T)\,x_t}{a_0(\kappa, T)}\right) \pm \frac{1}{2}\bar\sigma^2 (T - t)}{\bar\sigma\sqrt{T - t}}, \tag{2.144}$$

the time-averaged volatility is 

$$\bar\sigma^2 = \frac{1}{T - t}\int_t^T \sigma(u)^2\,du, \tag{2.145}$$

and N(·) is the cumulative standard normal distribution function. This formula is valid 
for cases in which $b_0(\kappa, T)/a_0(\kappa, T) < 0$. Deriving a similar pricing formula for the case 
$b_0/a_0 > 0$ is left as an exercise for the reader. 

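A payer’s swaption was considered in Section 2.2.2, with payoff 

$$PSO_T = \tau (r_T^s - \kappa)_+ \sum_{j=1}^n Z_T(T_j), \tag{2.146}$$

where the swap rate $r_t^s$ at time t and the strike rate $\kappa$ are in units of an interest rate (i.e., 
time$^{-1}$). Assuming n payments and a swap rate of the form 

$$r_T^s = \frac{1 - Z_T(T_n)}{\tau \sum_{j=1}^n Z_T(T_j)},$$

we can write the price of a payer’s swaption maturing at time T as 

$$\begin{aligned}
PSO_t(\kappa, T) &= g_t E_t^{Q(g)}\left[\frac{1}{g_T}\left(1 - Z_T(T_n) - \kappa\tau\sum_{j=1}^n Z_T(T_j)\right)_+\right] \\
&= g_t E_t^{Q(g)}\left[\left(a_n(\kappa, T) + b_n(\kappa, T) x_T\right)_+\right].
\end{aligned} \tag{2.147}$$

In the last equation we have used the identity [see equation (2.138)] 

$$Z_T(T_j)\left[f(T) + g(T) x_T\right] = f(T_j) + g(T_j) x_T, \tag{2.148}$$

giving 

$$a_n(\kappa, T) = f(T) - f(T_n) - \kappa\tau\sum_{j=1}^n f(T_j), \quad b_n(\kappa, T) = g(T) - g(T_n) - \kappa\tau\sum_{j=1}^n g(T_j). \tag{2.149}$$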


As before, by using the lognormal probability density function for $x_T$, the expectation integral 
gives rise to an exact pricing formula: 

$$PSO_t(\kappa, T) = g_t\left[a_n(\kappa, T)\,N(h_-^{(n)}(t, T, \kappa)) + b_n(\kappa, T)\,x_t\,N(h_+^{(n)}(t, T, \kappa))\right], \tag{2.150}$$

where 

$$h_\pm^{(n)}(t, T, \kappa) = \frac{\log\left(-\dfrac{b_n(\kappa, T)\,x_t}{a_n(\kappa, T)}\right) \pm \frac{1}{2}\bar\sigma^2 (T - t)}{\bar\sigma\sqrt{T - t}}, \tag{2.151}$$

the time-averaged volatility is given by equation (2.145), and N(·) is the cumulative standard 
normal distribution function. This formula is valid for cases in which $b_n(\kappa, T)/a_n(\kappa, T) < 0$. 
Deriving a similar pricing formula for the case $b_n/a_n > 0$ is left as an exercise. 
2.4 Multifactor Models 


Multifactor models make use of the observed yield curve, and this in turn can be described 
either as a collection of zero-coupon bonds (i.e., discount bonds) of various maturities T 
with respect to an arbitrary calendar time t with price $Z_t(T)$ or by the instantaneous forward 
rates. In what follows we denote present (today’s) calendar time as t = 0, whereas time $t \ge 0$ 
generally stands for any time in the future or today. It is useful at this point to review very 
briefly the connection between these quantities and their relation to the instantaneous short 
rate. Let us recall the continuously compounded time-t forward rate for a future finite time 
interval $[T, T + \tau]$ as given by 

$$f_t(T, T + \tau) = -\frac{\log Z_t(T + \tau) - \log Z_t(T)}{\tau}. \tag{2.152}$$

In the limit $\tau \to 0$ this defines the instantaneous forward rate $f_t(T)$ as 

$$f_t(T) = -\frac{\partial}{\partial T}\log Z_t(T). \tag{2.153}$$

Hence, forward rates and discount bond prices are also linked by 

$$Z_t(T) = \exp\left(-\int_t^T f_t(s)\,ds\right). \tag{2.154}$$


This simple expression can be directly contrasted to that of the discount bond price given in 
terms of the risk-neutral expectation involving the instantaneous short rate $r_t$, 

$$Z_t(T) = E_t^{Q(B)}\left[e^{-\int_t^T r_s\,ds}\right]. \tag{2.155}$$

The bond price is therefore related to a path-integral of the stochastic variable $r_t$ rather than 
to a simple (nonstochastic) integral as in the case of the forward rates. This path-integral 
expectation shows that if the short rate is stochastic, then $f_t(T) \ne r_T$ (t < T), whereas when 
$r_t$ is deterministic the expectation is simply a regular integral and we have $f_t(T) = r_T$ for 
all t < T. In the HJM treatment described shortly, one is directly modeling the forward rates 
as local stochastic (i.e., Markov) processes. In view of the path-integral relationship between 
the short rate and the forward rates, one anticipates a generally non-Markovian theory for 
the short rate. A simple result of the formulation is that for generally stochastic short-rate 
processes we have 

$$f_t(t) = r_t. \tag{2.156}$$

This obtains by equating the right-hand sides of equations (2.154) and (2.155), with $T = t + \epsilon$ 
($\epsilon > 0$), and differentiating with respect to $\epsilon$, giving 

$$E_t^{Q(B)}\left[e^{-\int_t^{t+\epsilon} r_s\,ds}\, r_{t+\epsilon}\right] = f_t(t + \epsilon)\, e^{-\int_t^{t+\epsilon} f_t(s)\,ds}. \tag{2.157}$$

Taking the limit $\epsilon \to 0$ gives equation (2.156), since $E_t[r_t] = r_t$, i.e., the value of the 
instantaneous short rate at time t. 
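
The relations among discount bond prices, finite-period forward rates, and instantaneous forward rates can be illustrated numerically. The Python sketch below is an illustration only: the quadratic-exponent discount curve is a hypothetical choice of ours, and the function names are not from the text. It computes the forward rate of equation (2.152), checks that it approaches the instantaneous forward rate as the period shrinks, and rebuilds the bond price from the forward rates through equation (2.154).

```python
import math


def forward_rate(Z, T, tau):
    """Continuously compounded forward rate over [T, T + tau], equation (2.152)."""
    return -(math.log(Z(T + tau)) - math.log(Z(T))) / tau


def bond_from_forwards(f_inst, t, T, steps=1000):
    """Discount bond price from instantaneous forward rates, equation (2.154),
    with the integral evaluated by the trapezoidal rule."""
    h = (T - t) / steps
    total = 0.5 * (f_inst(t) + f_inst(T))
    for k in range(1, steps):
        total += f_inst(t + k * h)
    return math.exp(-h * total)


# Hypothetical discount curve seen from t = 0 (an assumption for illustration):
# Z_0(T) = exp(-(0.02 T + 0.005 T^2)), whose instantaneous forward rate is 0.02 + 0.01 T.
def Z(T):
    return math.exp(-(0.02 * T + 0.005 * T * T))


def f_inst(T):
    return 0.02 + 0.01 * T


if __name__ == "__main__":
    for tau in (1.0, 0.1, 0.001):
        print(tau, forward_rate(Z, 2.0, tau), f_inst(2.0))
    print(bond_from_forwards(f_inst, 0.0, 5.0), Z(5.0))
```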
2.4.1 Heath–Jarrow–Morton with No-Arbitrage Constraints 


An arbitrage-free dynamics of the yield curve in a diffusion model must satisfy constraints 
that take up various forms, depending on the modeling framework. In this section, we 
review the Heath-Jarrow—Morton (HJM) constraint for models of instantaneous forward rates 
[HJM92]. In the next section, we discuss the Brace—Gatarek—Musiela—Jamshidian (BGMJ) 
condition, where one models LIBOR rates instead. We present formulas in the context of one 
independent risk factor; however, the multifactor extension follows in an obvious manner, 
and we leave the derivation as an exercise problem. 
Consider an interest rate stochastic process specified through the short rate 

$$dr_t = \mu^r(r_t, t)\,dt + \sigma^r(r_t, t)\,dW_t^g, \tag{2.158}$$

in a suitable measure Q(g). When working within the risk-neutral measure, recall that all 
assets drift at the instantaneous short rate $r_t$. In particular, all discount bonds of any maturity 
T are assets, and hence 

$$dZ_t(T) = r_t Z_t(T)\,dt + \sigma_t^{Z(T)} Z_t(T)\,dW_t, \tag{2.159}$$

under the risk-neutral measure with numeraire $g_t = B_t$ and $dW_t$ as Brownian increment in 
Q(B). Notice that if one chooses a numeraire other than the money-market account $B_t$, then, 
in accordance with the asset pricing theorem in Chapter 1, the drift for any asset (including 
any discount bond) will have an extra term added to $r_t$ to account for the price of risk. We 
use the shorthand notation $\sigma_t^{Z(T)} = \sigma^Z(t, T, Z_t(T))$ to denote the time-t volatility of the bond 
price. It is important to observe that in general, the bond price volatility is allowed to be a 
function of calendar time t, maturity time T, and the (stochastic) bond price $Z_t(T)$ at time t. 

Thanks to Itô’s lemma, the logarithm of the discount function obeys the following 
stochastic differential equation: 

$$d[\log Z_t(T)] = \left[r_t - \frac{1}{2}\left(\sigma_t^{Z(T)}\right)^2\right]dt + \sigma_t^{Z(T)}\,dW_t. \tag{2.160}$$

Since this equation applies for any value of T, we can use it for maturity T and $T + \tau$. 
Combining this with equation (2.152) gives the stochastic differential of the rate $f_t(T, T + \tau)$: 

$$d[f_t(T, T + \tau)] = \frac{\left(\sigma_t^{Z(T+\tau)}\right)^2 - \left(\sigma_t^{Z(T)}\right)^2}{2\tau}\,dt - \frac{\sigma_t^{Z(T+\tau)} - \sigma_t^{Z(T)}}{\tau}\,dW_t. \tag{2.161}$$

The stochastic differential of the instantaneous forward rate in the risk-neutral measure now 
obtains in the limit $\tau \to 0$: 

$$\begin{aligned}
df_t(T) &= \sigma_t^{Z(T)} \sigma_t^{\prime Z(T)}\,dt - \sigma_t^{\prime Z(T)}\,dW_t \\
&= \mu_t^{f(T)}\,dt + \sigma_t^{f(T)}\,dW_t.
\end{aligned} \tag{2.162}$$

The last equation defines the drift $\mu_t^{f(T)} = \mu^f(t, T, f_t(T))$ and volatility $\sigma_t^{f(T)} = \sigma^f(t, T, f_t(T))$ 
of the instantaneous forward rate. The superscript prime is used to denote differentiation with 
respect to T, i.e., $\sigma_t^{\prime Z(T)} = \partial\sigma_t^{Z(T)}/\partial T$. It turns out that one can relate the drift with the volatility 
of $f_t(T)$, since a simple integration of the bond price volatility derivative, with respect to 
maturity time, gives 

$$\int_t^T \sigma_t^{\prime Z(\tau)}\,d\tau = \sigma_t^{Z(T)}. \tag{2.163}$$
In this equation we have used the fact that $\sigma_t^{Z(t)} = 0$, which says that the bond price has zero 
volatility with known unit value when t = T. Then, using the earlier relations for the drift 
and volatility of $f_t(T)$ in terms of the bond price volatility, we arrive at 

$$\mu_t^{f(T)} = \sigma_t^{f(T)} \int_t^T \sigma_t^{f(\tau)}\,d\tau. \tag{2.164}$$

This result shows that the drift of $f_t(T)$ is linked to its volatility and the volatilities of all 
forward rates $f_t(\tau)$ between times $\tau = t$ and $\tau = T$. The link between the drift and volatility 
of the instantaneous forward rate was first noted by Heath, Jarrow, and Morton. 
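
The HJM link between drift and volatility is easy to evaluate once a forward-rate volatility is specified. The Python sketch below is an illustration under assumptions of ours: it takes a deterministic, exponentially damped volatility $\sigma^f(t, T) = \sigma e^{-\kappa(T - t)}$ (a hypothetical choice, not one made in the text), integrates it numerically as in equation (2.164), and compares the result with the closed form $\sigma^2 e^{-\kappa(T-t)}(1 - e^{-\kappa(T-t)})/\kappa$ that follows for this particular volatility.

```python
import math


def hjm_drift(sigma_f, t, T, steps=2000):
    """Risk-neutral drift of f_t(T) from equation (2.164):
    sigma_f(t, T) times the integral of sigma_f(t, s) over s in [t, T],
    with the integral evaluated by the trapezoidal rule."""
    h = (T - t) / steps
    total = 0.5 * (sigma_f(t, t) + sigma_f(t, T))
    for k in range(1, steps):
        total += sigma_f(t, t + k * h)
    return sigma_f(t, T) * h * total


# Hypothetical volatility parameters (assumptions for illustration only).
SIGMA, KAPPA = 0.01, 0.2


def sigma_f(t, T):
    return SIGMA * math.exp(-KAPPA * (T - t))


def closed_form_drift(t, T):
    x = math.exp(-KAPPA * (T - t))
    return SIGMA ** 2 * x * (1.0 - x) / KAPPA


if __name__ == "__main__":
    for T in (1.0, 5.0, 10.0):
        print(T, hjm_drift(sigma_f, 0.0, T), closed_form_drift(0.0, T))
```

The drift vanishes at T = t, where the integration range collapses.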
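From this treatment one can arrive at the risk-neutral process for the short rate stated in 
equation (2.158). Using equation (2.162) rewritten in the form 

$$df_\tau(t) = \sigma_\tau^{Z(t)} \sigma_\tau^{\prime Z(t)}\,d\tau - \sigma_\tau^{\prime Z(t)}\,dW_\tau, \tag{2.165}$$

integrating and using equation (2.156), we find 

$$r_t = f_0(t) + \int_0^t \sigma_\tau^{Z(t)} \sigma_\tau^{\prime Z(t)}\,d\tau - \int_0^t \sigma_\tau^{\prime Z(t)}\,dW_\tau. \tag{2.166}$$

At this point one can apply the rule for differentiating an Itô integral, 

$$d\left[\int_0^t h(\tau, t)\,dW_\tau\right] = h(t, t)\,dW_t + \left[\int_0^t \frac{\partial h(\tau, t)}{\partial t}\,dW_\tau\right]dt, \tag{2.167}$$

where $h(\tau, t)$ is any smooth function. By differentiating the integral expression for $r_t$ and 
again using $\sigma_t^{Z(t)} = 0$, we obtain the stochastic process for the short rate as: 

$$\begin{aligned}
dr_t = {} & \left[\frac{\partial f_0(t)}{\partial t} + \int_0^t \left(\sigma_\tau^{Z(t)} \sigma_\tau^{\prime\prime Z(t)} + \left(\sigma_\tau^{\prime Z(t)}\right)^2\right)d\tau - \int_0^t \sigma_\tau^{\prime\prime Z(t)}\,dW_\tau\right]dt \\
& - \sigma_t^{\prime Z(t)}\,dW_t.
\end{aligned} \tag{2.168}$$

The risk-neutral drift for the short rate is, therefore, non-Markovian, since it has a dependence 
on stochastic variables for times earlier than t, as given by the integral and stochastic integral 
over all times $\tau = 0$ to $\tau = t$ of factors involving the bond volatilities and their derivatives. 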


Problems 


Problem 1. Suppose we have n independent risk factors. The instantaneous forward-rate 
process of equation (2.162) then takes the form 

$$df_t(T) = \mu_t^{f(T)}\,dt + \sum_{j=1}^n \sigma_{t,j}^{f(T)}\,dW_t^j, \tag{2.169}$$

where $\sigma_{t,j}^{f(T)}$ are volatilities corresponding to the jth risk factor. Show that equation (2.164) 
generalizes to 

$$\mu_t^{f(T)} = \sum_{j=1}^n \sigma_{t,j}^{f(T)} \int_t^T \sigma_{t,j}^{f(\tau)}\,d\tau. \tag{2.170}$$

Problem 2. Using the result of Problem 1, obtain the multifactor extension for the short-rate 
process given by equation (2.168). 
2.4.2 Brace–Gatarek–Musiela–Jamshidian with No-Arbitrage Constraints 


The reader can observe that in all previously presented treatments of the yield curve, including 
HJM, the theory has made use of either a continuum of discount bonds (i.e., of any maturity) or 
a continuum of instantaneous forward rates. Such continua provide a basis for the description 
of points on the yield curve lying on just discrete time intervals, as in LIBOR-based 
instruments. In contrast, in this section we briefly present the BGMJ model (after Brace, Gatarek, 
Musiela, and Jamshidian), which models discrete market quantities, namely, the LIBOR rates [BGM97]. 

Within BGMJ one considers a situation with a lattice of n maturities $T_i = T_0 + i\tau$, $i = 
0, 1, 2, \ldots, n - 1$, and the corresponding simply compounded forward rates $f_t^{(\tau)}(T_i, T_{i+1})$ for 
a finite period $\tau$ (e.g., 1 month, 3 months, 6 months). Recall the formula for the forward rate 
in terms of discount bond price ratios, 

$$1 + \tau f_t^{(\tau)}(T_i, T_{i+1}) = \frac{Z_t(T_i)}{Z_t(T_{i+1})}. \tag{2.171}$$

To keep the notation simple, we now introduce the symbol (for given $\tau$) 

$$L_t(T_i) = f_t^{(\tau)}(T_i, T_{i+1}). \tag{2.172}$$

Moreover, we present the treatment within a one-factor notation, although the extension to 
many independent risk factors readily follows, and we leave this as an exercise. We now 
proceed by assuming that each LIBOR rate is a random variable obeying an SDE of the form 

$$\frac{dL_t(T_i)}{L_t(T_i)} = \mu_t^{L(T_i)}\,dt + \sigma_t^{L(T_i)}\,dW_t; \tag{2.173}$$

similarly, for each maturity one writes an SDE for each discount bond price process as 

$$\frac{dZ_t(T_i)}{Z_t(T_i)} = \mu_t^{Z(T_i)}\,dt + \sigma_t^{Z(T_i)}\,dW_t. \tag{2.174}$$

Here we have used shorthand notation to denote the drifts and volatilities, which can generally 
be functions of t, $T_i$, and the underlying rate or bond price: 

$$\begin{aligned}
\mu_t^{L(T_i)} &= \mu^L(t, T_i, L_t(T_i)), & \sigma_t^{L(T_i)} &= \sigma^L(t, T_i, L_t(T_i)), \\
\mu_t^{Z(T_i)} &= \mu^Z(t, T_i, Z_t(T_i)), & \sigma_t^{Z(T_i)} &= \sigma^Z(t, T_i, Z_t(T_i)).
\end{aligned} \tag{2.175}$$

Also, we assume to be in a basis of risk factors with no correlations among the LIBOR rates 
and bond prices. The addition of correlations along with the multifactor extension of the 
formulas is fairly straightforward and will be left as an exercise. 


Taking the stochastic time-t differential of equation (2.171) on both sides and using Itô’s 
lemma in the form of equation (1.137), one finds 

$$\tau L_t(T_i)\left(\mu_t^{L(T_i)}\,dt + \sigma_t^{L(T_i)}\,dW_t\right) = \frac{Z_t(T_i)}{Z_t(T_{i+1})}\left[\left(q_t - \sigma_t^{Z(T_{i+1})}\right)\left(\sigma_t^{Z(T_i)} - \sigma_t^{Z(T_{i+1})}\right)dt + \left(\sigma_t^{Z(T_i)} - \sigma_t^{Z(T_{i+1})}\right)dW_t\right]. \tag{2.176}$$

Notice that here $q_t$ is a price of risk, which is generally nonzero since the underlying measure 
is not necessarily assumed to be the forward-neutral measure, wherein forward rates are 
martingales. Typically, the measure can be chosen to be the so-called spot-LIBOR measure, 
under which $Z_t(T_i)^{-1}$ [and not $Z_t(T_{i+1})^{-1}$] is a martingale. In this case the forward rates are 
not martingales. 

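Using equations (2.171) and (2.172) within the last equation and equating coefficients in 
$dW_t$ gives a recurrence relation among the bond volatilities at the different maturities: 

$$\sigma_t^{Z(T_{i+1})} - \sigma_t^{Z(T_i)} = -\frac{\tau L_t(T_i)\,\sigma_t^{L(T_i)}}{1 + \tau L_t(T_i)}. \tag{2.177}$$

This is easily iterated to give 

$$\sigma_t^{Z(T_i)} = \sigma_t^{Z(T_j)} - \sum_{k=j}^{i-1} \frac{\tau L_t(T_k)\,\sigma_t^{L(T_k)}}{1 + \tau L_t(T_k)}, \quad i > j. \tag{2.178}$$

On the other hand, the drift of the LIBOR forward rates is given by equating coefficients in 
dt in the preceding SDE while using equation (2.177); hence, 

$$\begin{aligned}
\mu_t^{L(T_i)} &= \sigma_t^{L(T_i)}\left(q_t - \sigma_t^{Z(T_{i+1})}\right) \\
&= \sigma_t^{L(T_i)}\left(q_t - \sigma_t^{Z(T_j)} + \sum_{k=j}^{i} \frac{\tau L_t(T_k)\,\sigma_t^{L(T_k)}}{1 + \tau L_t(T_k)}\right).
\end{aligned} \tag{2.179}$$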

Possible specifications of the volatility are of the form 

$$\sigma_t^{L(T_i)} = L_t(T_i)^{\beta - 1}\,\sigma(t, T_i), \tag{2.180}$$

where $\beta = 1$ corresponds to lognormal models and $\beta = \frac{1}{2}$ to square-root models. 

We conclude this section by providing a pricing formula for the special case of the 
lognormal model with $\beta = 1$. In particular, the pricing formula for caplets of tenor $\tau$ and 
with settlement at one of the maturities $T_i$ can be computed in analytical closed form. Using 
similar methods as discussed in previous sections, one can arrive at a Black-Scholes type of 
pricing formula for a caplet struck at rate $\kappa$ and tenor $\tau$: 

$$Cpl_t(T_i, \kappa) = \tau Z_t(T_i + \tau)\left[L_t(T_i)\,N(d_+(t, T_i, \kappa)) - \kappa\,N(d_-(t, T_i, \kappa))\right], \tag{2.181}$$

where 

$$d_\pm(t, T, \kappa) = \frac{\log\left(\dfrac{L_t(T)}{\kappa}\right) \pm \frac{1}{2}\bar\sigma(t, T)^2}{\bar\sigma(t, T)}, \tag{2.182}$$

N(·) is the cumulative standard normal distribution function, and the unnormalized average 
LIBOR rate volatility is given by 

$$\bar\sigma(t, T)^2 = \int_t^T \sigma(u, T)^2\,du. \tag{2.183}$$
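
A Black-Scholes type of caplet formula such as this one takes only a few lines to implement. The Python sketch below is an illustration, not a reference implementation: the function and argument names are ours, and the volatility input is assumed to be the unnormalized quantity $\bar\sigma(t, T_i)$, so that a constant lognormal volatility $\sigma$ over a horizon $T_i - t$ would enter as $\sigma\sqrt{T_i - t}$.

```python
import math


def norm_cdf(x):
    """Cumulative standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def lognormal_caplet(L, kappa, tau, discount, sigma_bar):
    """Caplet price of the form tau * Z * [L N(d+) - kappa N(d-)].

    L          forward LIBOR rate L_t(T_i)
    kappa      strike rate
    tau        tenor
    discount   discount bond price Z_t(T_i + tau)
    sigma_bar  unnormalized volatility, sqrt of the integrated squared volatility
    """
    if sigma_bar <= 0.0:
        # Zero-volatility limit: discounted intrinsic value.
        return tau * discount * max(L - kappa, 0.0)
    d_plus = (math.log(L / kappa) + 0.5 * sigma_bar ** 2) / sigma_bar
    d_minus = d_plus - sigma_bar
    return tau * discount * (L * norm_cdf(d_plus) - kappa * norm_cdf(d_minus))


if __name__ == "__main__":
    # Hypothetical inputs: 5% forward rate, 5% strike, quarterly tenor,
    # constant 20% lognormal volatility over one year.
    sigma_bar = 0.20 * math.sqrt(1.0)
    print(lognormal_caplet(0.05, 0.05, 0.25, 0.95, sigma_bar))
```

For an at-the-money caplet the bracket reduces to $L\,(2N(\bar\sigma/2) - 1)$, which gives a convenient sanity check.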


Swaptions are more problematic. Pricing a swaption struck at rate $\kappa$ requires an evaluation 
of an expectation under the measure with $Z_t(T_{i+1})$ as numeraire, 

$$PSO_t(T, \kappa) = \tau \sum_{i=1}^n Z_t(T_{i+1})\,E_t^{Q(Z(T_{i+1}))}\left[\left(L_T(T_i) - \kappa\right) 1_D\right], \tag{2.184}$$

where $1_D$ is the indicator function of the set D of paths for which the payer’s swaption ends 
up in the money, i.e., the set 

$$D = \{r_T^s - \kappa \ge 0\}, \tag{2.185}$$

where the swap rate $r_T^s$ is given by equation (2.23). The random variables $L_T(T_i)$, $i = 1, \ldots, n$, 
are also assumed to be correlated. 


2.5 Real-World Interest Rate Models 


Modeling the real-world evolution of interest rate curves over long time periods is interesting 
for applications in risk management for assessing overnight risk. In corporate finance as 
well, these models are used to assess the risk exposure over a time horizon of several years 
for portfolios of interest-sensitive assets. Acceptable models should ensure that all forward 
interest rates are positive at all times and will involve some sort of principal-component 
analysis. In this section we discuss a simple model with the salient features. 

An acceptable model meeting the no-arbitrage condition for forward rates is conveniently 
formulated in terms of the logarithms of forward rates. Consider a situation with a finite 
number of key rates for maturity times $T_1, T_2, \ldots, T_N$. A possible choice of $T_i$, following 
RiskMetrics™ [Mor96b], is to select the terms 1 m, 2 m, 3 m, 6 m, 9 m, 12 m, 2 y, 3 y, 4 y, 
5 y, 7 y, 10 y, 15 y, 20 y, 25 y, and 30 y. Consider the logarithms of the time-t forward rates 
for the intervals $[T_i, T_{i+1}]$: 

$$x_t(i) = \log f_t(T_i, T_{i+1}). \tag{2.186}$$

There are several different ways to go about performing statistical estimations. 
If short-term scenarios over time horizons of 1–10 days are sought, one can study the 
log-returns over the desired period of the time series $x_t(i)$, 

$$\delta x_t(i) = x_t(i) - x_{t-1}(i), \tag{2.187}$$

and estimate the covariance matrix as the historical expectation 

$$C_{ij} = E[\delta x(i) \cdot \delta x(j)] = \frac{1}{M}\sum_{t=1}^M \delta x_t(i)\,\delta x_t(j). \tag{2.188}$$

Here we assume a return time series of length M. Over short time horizons, the fat-tailed 
character of return distributions is an important feature to take into account. It appears that the 
degree of kurtosis depends on the term, with shorter maturities being more sensitive to shocks 
caused by changes of Central Bank policies. In this case, a possible approach is to estimate 
each term separately. Another approach is to perform portfolio-dependent estimations. The 
latter method is more accurate but less general. 

If long-term scenarios are sought, it is appropriate to compute a singular-value 
decomposition of the rectangular matrix Y made up of the mean-subtracted time series 

$$y_t(i) = x_t(i) - E[x(i)] = x_t(i) - \frac{1}{M}\sum_{t'=1}^M x_{t'}(i). \tag{2.189}$$

Here the expectation is again computed by taking historical averages. M is the number of 
historical data points, and N is the number of forward dates. The matrix Y has M rows and 
N columns, and its singular-value decomposition 

$$Y = U \cdot S \cdot V' \tag{2.190}$$

involves an M × M matrix U, an N × N matrix V and an M × N diagonal matrix S of 
singular values. The columns of the matrix V are the principal components, denoted with $u^\alpha$ 
(see Figure 2.9). If one projects the time series $y_t$ along the principal components, one finds 
time series for the component scores: 

$$\eta_t^\alpha = \sum_{i=1}^N y_t(i)\,u^\alpha(i), \quad \alpha = 1, \ldots, N. \tag{2.191}$$

The component scores show a clear tendency to follow a mean-reverting process. 
A statistical model can be built by first finding the auto-regression coefficients $m(\alpha)$ such 
that 

$$\eta_{t+1}^\alpha - \eta_t^\alpha = -m(\alpha)\,\eta_t^\alpha + \epsilon_t^\alpha. \tag{2.192}$$


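Here, the coefficients $m(\alpha)$ are computed by solving a least-squares problem. Second, one can 
postulate that the residuals $\epsilon_t^\alpha$ are normally distributed and estimate the covariance matrix as 

$$C^{\alpha\beta} = E[\epsilon^\alpha \epsilon^\beta] = \frac{1}{M}\sum_{t=1}^M \epsilon_t^\alpha\,\epsilon_t^\beta. \tag{2.193}$$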

FIGURE 2.9 Three typical principal components for the forward curve as a function of the key maturity 
dates (i.e., the term) using a time series of U.S. Treasury curves. 

It is common for portfolios to be sensitive to interest rates in one currency as well as to 
interest rates in foreign currencies and to the exchange rates. Hence, one can consider the 
case of R interest rate discount curves $Z_t^i(T)$, $i = 0, \ldots, R - 1$, and R currencies, giving rise 
to (R − 1) independent exchange rates $X_t^i$, $i = 1, \ldots, R - 1$, giving the worth of one unit of 
the ith currency in the base currency with i = 0. In this case, long-term statistical estimations 
must also account for the arbitrage condition, yielding forward exchange rates in terms of the 
spot exchange rates and interest rate curves. Namely, 

$$F_t(X^i, T) = X_t^i\,\frac{Z_t^i(T)}{Z_t^0(T)}. \tag{2.194}$$

For a long-term statistical analysis, one can still accomplish a principal-component analysis 
and estimate the mean reversion rates for interest rates along the same lines. In addition, 
one needs a model for the spot foreign exchange rates, which, jointly with the no-arbitrage 
constraint in equation (2.194), yields all of the foreign exchange curves. 
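
The no-arbitrage constraint on forward exchange rates amounts to a one-line computation. The Python sketch below is an illustration only: the function name, the flat continuously compounded rates, and the spot rate are hypothetical inputs of ours.

```python
import math


def forward_fx(spot, z_foreign, z_base):
    """Forward exchange rate for delivery at T: spot times the ratio of the
    foreign-currency discount bond price to the base-currency one."""
    return spot * z_foreign / z_base


if __name__ == "__main__":
    # Hypothetical flat curves: 3% in the base currency, 1% in the foreign currency.
    T = 2.0
    z_base = math.exp(-0.03 * T)
    z_foreign = math.exp(-0.01 * T)
    print(forward_fx(1.25, z_foreign, z_base))
```

With flat rates the forward rate reduces to the spot rate grown at the interest rate differential, here $1.25\,e^{(0.03 - 0.01) \cdot 2}$.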


CHAPTER 3 


Advanced Topics in Pricing Theory: 
Exotic Options and State-Dependent Models 


Exotic options is a term used to describe derivative securities having cash flow or payoff 
structures that are more intricate and more complex than standard contracts such as plain- 
vanilla calls and puts. One main reason for trading, and hence pricing, such contracts is that 
they permit a much larger degree of flexibility for use in risk management and speculation. 
The payoff structure of these contracts can be fabricated to provide a higher leverage from 
an investor’s viewpoint. Examples of this arise in so-called barrier options, the pricing of 
which is presented in great detail in this chapter. The theoretical pricing and hedging of exotic 
as well as standard derivatives depends largely on the stochastic model employed for the 
underlying asset price processes. The study of various models for the underlying asset price 
process is therefore of importance to pricing theory as a whole. 

This chapter is largely devoted to the development and application of exact solution 
methodologies for pricing derivatives under state-dependent asset price processes. A fairly 
general mathematical framework is presented for obtaining pricing kernels satisfying various 
boundary conditions. The kernels are then used to obtain new families of analytically exact 
closed-form pricing formulas for standard as well as barrier-style European options under 
various types of multiparameter state-dependent volatility models. The approach we take 
for tackling state-dependent models is of a general nature whereby we solve for the most 
fundamental quantities: the pricing kernels or transition probability density functions. This, 
in turn, is achieved by introducing a new and special type of “mapping” of the original state- 
dependent diffusion problem onto a related, yet simpler, diffusion problem corresponding 
to an appropriately chosen, simpler underlying process. The original diffusion problem is 
essentially reduced to a simpler diffusion for which exact pricing kernels are obtained by 
means of more standard methods. Once a kernel for the simpler underlying diffusion process 
is obtained, pricing kernels for a family of more complicated state-dependent volatility models 
are generated by direct substitution into a formula that provides an exact relationship between 
any two kernels: one for the simple diffusion and the other belonging to the family of 
kernels for the original state-dependent volatility model. The derivation of this useful formula 
is discussed at length in this chapter. Throughout this chapter we refer to the underlying 
(simpler) diffusion process as the so-called x-space process, while the price process of interest 
(i.e., the more complex process we wish to describe for pricing) is referred to as the F-space 
process. The process F, can be used to denote either an asset price or a forward price at 
time t. 

Two particularly useful choices of underlying x-space processes are (i) the Wiener process 
and (ii) the Bessel process. We present exact solution methods for the transition density 
functions (i.e., the x-space kernels) for the Wiener and Bessel processes, separately, subject to 
nonabsorbing as well as all types of absorbing boundary conditions that correspond to either 
single- or double-barrier cases. The single- and double-barrier pricing kernels in the forward 
(or asset price) space of interest are then immediately generated by direct substitutions 
via our main formula. We shall see that the F-space pricing kernels for the linear and 
quadratic volatility models with two distinct roots can be generated simply from the standard 
Wiener densities. More complex and more abundant state-dependent pricing kernels arise 
from underlying densities for the Bessel process. In particular, a considerably larger family of 
analytically exact (F-space) pricing kernels containing as many as six adjustable parameters, 
which we shall refer to as the Bessel family, is generated from the underlying Bessel process. 
The Bessel family of solutions involves Bessel functions, as the name naturally suggests. This 
family is quite elaborate in structure because it is also shown to represent the exact solutions 
to most of the popular pricing models, including the linear, quadratic, and constant-elasticity- 
of-variance (CEV) volatility models as special cases. Some applications of the Bessel family 
of pricing kernels to option pricing are discussed in this chapter. 

The first section introduces barrier options. The mathematical framework for obtaining 
probability densities for a process involving absorption at a barrier is then introduced in 
Section 3.2, where the simplest case is considered: a single-barrier Wiener process. The 
method of images is used to obtain the Wiener density for one absorbing barrier. Building 
on the results of Section 3.2, exact pricing kernels as well as single-barrier option formulas 
for the affine (linear volatility or lognormal model) and quadratic diffusion models are 
presented in Section 3.5. The method of Green’s functions is then presented in Section 3.6 
for solving the Kolmogorov partial differential equations for the kernel. In particular, we 
consider an underlying x-space diffusion process and show how analytical formulas for the 
time-dependent transition probability density for (barrier-type) absorbing boundary conditions 
as well as nonabsorbing (barrier-free) conditions are generated via the (time-independent) 
Green’s functions. In doing so, we also briefly present the basic important features of the 
Sturm-Liouville theory of ordinary differential equations for obtaining Green’s functions. The 
Green’s functions are obtained in two forms: (i) as special functions and (ii) as eigenfunction 
expansions. Green’s functions of the first form lead to exact closed-form solutions for the 
transition density, generally in terms of special functions, whereas Green’s functions of the 
second form give analytical series expansions for the kernel. The Green’s functions formulas 
are then used in the subsequent sections to obtain transition densities for the Bessel process 
via complex variable contour integration methods. We then show how these densities can be 
used to directly generate new pricing kernels and European option pricing formulas for new 
families of diffusion models. Formulas are presented for the barrier-free, single-barrier, and 
double-barrier cases. A discussion on the hierarchy of state-dependent models is also presented 
in light of the Bessel family as providing a model that recovers solutions to a class of 
popular models. 

FIGURE 3.1 Sample asset price paths hitting a lower or upper barrier. 


3.1 Introduction to Barrier Options 


A barrier option is a particular kind of exotic option because it is to some extent path 
dependent. That is, the option’s pay-off, and hence its value, depends on the realized underlying 
asset path via the level attained at any time before a given maturity time $T$. If one 
considers an asset of price $A_t$ (e.g., a stock price), then a barrier for an option contract is 
generally given by a time-dependent price threshold $H_t$, $t \le T$, on which the pay-off depends. 
[Note: As seen later, most standard barrier option contracts are structured as having a fixed 
(i.e., time-independent) barrier level or levels for a chosen underlying asset price.] Barrier 
options can be conveniently characterized in terms of stopping times. Let us denote by $\tau(A, H)$ 
the minimum time $\tau \in [t_0, T]$ for which the asset price $A_t$, starting at $A_0$ at current (initial) 
time $t = t_0$, first crosses or hits the barrier at level $H_t$, i.e., the first time $\tau$ for which 
$A_\tau = H_\tau$. Note that the stopping time is dependent on the complete path $A_t$ and the barrier 
level $H_t$ at all times $t \in [t_0, T]$. 

There are two basic types of single-barrier options: (i) knockout options, which have a 
nonzero pay-off only if a level H is not attained, and (ii) knock-in options, which have a 
nonzero pay-off only if the level H is attained before or at maturity time T. There are then 
different flavors of these corresponding to whether the barrier level H is placed above (sin- 
gle upper-barrier option) or below (single lower-barrier option) or both above and below 
(double-barrier option) the initial asset price. We refer the reader to the project in Part II of 
this book for further details on these contracts and how one can go about hedging them with 
plain-vanilla puts and calls. These and other examples of elementary single-barrier options and 
their corresponding payoff structures can be characterized in terms of stopping times, as follows. 


(i) Knockout options with pay-off at time $T$: 

$$\phi(A_T)\,(1 - \mathbf{1}_{\tau \le T}), \tag{3.1}$$

and knock-in options with pay-off at time $T$: 

$$\phi(A_T)\,\mathbf{1}_{\tau \le T}, \tag{3.2}$$

with single-barrier level $H$. Here $\phi$ is a certain payoff function [e.g., $\phi(A) = (A - K)_+$ 
for a call struck at $K$], $\tau = \tau(A, H)$ is the stopping time for barrier level $H$, and 
$\mathbf{1}_B$ is the indicator function taking on value 1 or 0 if event $B$ occurs or not, respectively. 
For double-barrier knock-in/knockout options with lower level $L$ and upper level $H > L$, 
the pay-off is of the same form, where the indicator function in the foregoing two 
expressions is now replaced by $\mathbf{1}_{\min(\tau_L, \tau_H) \le T}$, with $\tau_L$, $\tau_H$ as stopping 
times for hitting levels $L$, $H$, respectively. 

(ii) Corridor options with two barrier levels $H^{(1)} < H^{(2)}$ and pay-off at time $T$: 

$$\phi(A_T)\,\mathbf{1}_{\tau_1 \le T}\,\mathbf{1}_{\tau_2 \le T}, \tag{3.3}$$

where $\tau_1 = \tau(A, H^{(1)})$, $\tau_2 = \tau(A, H^{(2)})$ are stopping times for hitting the two 
respective levels. Corridor options hence have a nonzero pay-off only if the asset price hits both 
levels before time $T$. 

(iii) Pay-at-hit one-touch options with pay-off at time $\tau$: 

$$\phi(A_\tau). \tag{3.4}$$

In contrast to the previous contracts, here the pay-off occurs at the stopping time, which is 
given by $\tau = \tau(A, H)$ in the case of a single level $H$, rather than at maturity $T$. 

(iv) Upper-wall options, with payoff 

$$\frac{1}{T - t_0} \int_{t_0}^{T} \phi(A_t)\,\mathbf{1}_{A_t > H_t}\,dt \tag{3.5}$$

and lower-wall options, with payoff 

$$\frac{1}{T - t_0} \int_{t_0}^{T} \phi(A_t)\,\mathbf{1}_{A_t < H_t}\,dt. \tag{3.6}$$

The pay-offs of these contracts are given by the time average of a certain pay-off over all 
time intervals for which the asset price is above or below the barrier level $H_t$. 
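
The pay-off definitions above can be sketched on a discretely sampled price path. The following Python fragment is only an illustration under stated assumptions: the barrier is constant, the path is monitored at equally spaced dates (so a crossing between dates is missed), and all function names are hypothetical rather than taken from the text.

```python
from typing import Callable, Optional, Sequence


def first_hit_index(path: Sequence[float], barrier: float) -> Optional[int]:
    """Index of the first monitoring date at which the path touches or crosses
    the barrier, approached from the side on which the path starts.
    Returns None if the barrier is never hit (discrete monitoring assumed)."""
    start_below = path[0] < barrier
    for i, a in enumerate(path):
        if (start_below and a >= barrier) or (not start_below and a <= barrier):
            return i
    return None


def call_payoff(strike: float) -> Callable[[float], float]:
    """phi(A) = (A - K)_+ for a call struck at K."""
    return lambda a: max(a - strike, 0.0)


def knock_out_payoff(path, barrier, phi):
    """Pay-off (3.1): phi(A_T) unless the barrier was hit on or before T."""
    return phi(path[-1]) if first_hit_index(path, barrier) is None else 0.0


def knock_in_payoff(path, barrier, phi):
    """Pay-off (3.2): phi(A_T) only if the barrier was hit on or before T."""
    return phi(path[-1]) if first_hit_index(path, barrier) is not None else 0.0


def upper_wall_payoff(path, barrier, phi, dt):
    """Left-endpoint Riemann sum for the time average (3.5), step size dt."""
    total_time = dt * (len(path) - 1)
    acc = sum(phi(a) * dt for a in path[:-1] if a > barrier)
    return acc / total_time


if __name__ == "__main__":
    path = [100.0, 104.0, 111.0, 107.0, 103.0]  # illustrative sample path
    phi = call_payoff(100.0)
    print("knock-out:", knock_out_payoff(path, 110.0, phi))
    print("knock-in: ", knock_in_payoff(path, 110.0, phi))
    print("upper wall:", upper_wall_payoff(path, 105.0, phi, 0.25))
```

On any single path the knock-in and knockout pay-offs add up to the plain pay-off $\phi(A_T)$, since exactly one of the two indicator events occurs.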

These elementary pay-offs can be engineered together to create more complex structures. 
These options are path-dependent securities and their price is affected by the dynamics of the 
implied volatility surface. From the modeling point of view it is often convenient to work in 
the space of the forward price process $F_t = F_t(A, T)$. 


3.2 Single-Barrier Kernels for the Simplest Model: 
The Wiener Process 


3.2.1 Driftless Case 


Recall equation (1.86), which is the probability density for free Brownian motion with drift 
and no barriers (i.e., with nonabsorbing homogeneous zero-boundary conditions imposed at 
$\pm\infty$). Setting the drift to zero gives the transition probability density for a pure Wiener 
process $x_t$ with constant volatility. Let us reconsider the Wiener process $x_t$ obeying the 
SDE $dx_t = \nu(x)\,dW_t$, with constant volatility function¹ $\nu(x) = \sqrt{2}$ and zero drift, and 
focus now on solving the corresponding forward and backward Kolmogorov partial differential 
equations: 

$$\frac{\partial u}{\partial t}(x, t; x_0, t_0) = \frac{\partial^2 u}{\partial x^2}(x, t; x_0, t_0) \tag{3.7}$$

and 

$$\frac{\partial u}{\partial t_0}(x, t; x_0, t_0) + \frac{\partial^2 u}{\partial x_0^2}(x, t; x_0, t_0) = 0, \tag{3.8}$$

subject to a delta function initial (or final) time condition in the case of the forward 
(backward) equation: $\lim_{t \to t_0} u(x, t; x_0, t_0) = \delta(x - x_0)$, $t - t_0 > 0$. More formal 
methods for solving equation (3.7) or (3.8), in the case of general time-independent volatility 
and drift functions, by application of Laplace transform and Green’s functions techniques, are 
discussed in Section 3.6. In this particularly simple example, however, we simply make use of the 
solution for the barrier-less case obtained in Chapter 1. Namely, the solution 
$u(x, t; x_0, t_0) = g_0(x, x_0; \tau)$ for the infinite domain $x, x_0 \in (-\infty, \infty)$, allowing 
paths to attain any finite value, is simply 

$$g_0(x, x_0; \tau) = \frac{e^{-(x - x_0)^2/4\tau}}{\sqrt{4\pi\tau}}. \tag{3.9}$$

¹ This choice of volatility proves convenient because solutions for arbitrary constant volatility 
$\nu(x) = \sigma = \text{const}$ obtain by a simple time scale change, i.e., by the replacement 
$t \to \tfrac{1}{2}\sigma^2 t$, $t_0 \to \tfrac{1}{2}\sigma^2 t_0$ within the solutions for $\nu(x) = \sqrt{2}$. 

Note: Throughout this section we define $\tau = t - t_0$. In most of what follows we shall work 
in terms of this time quantity, since the drift and volatility terms are not explicitly time 
dependent, hence giving rise to time-homogeneous solutions dependent on $\tau$. The boundary 
conditions are homogeneous: $\lim_{x \to \pm\infty} g_0(x, x_0; \tau) = 0$, given any $x_0$, and for the 
backward equation $\lim_{x_0 \to \pm\infty} g_0(x, x_0; \tau) = 0$, given any $x$ and finite time $\tau$. 
This so-called elementary solution can be used to obtain the solution to any other initial-value 
problem satisfying equation (3.7) [or (3.8)] and obeying homogeneous boundary conditions on the 
infinite domain. Indeed, the solution to the forward-time equation (3.7) for an initial 
distribution condition $u(x, t = t_0) = f(x)$ is given by the integral 

$$u(x, t) = \int_{-\infty}^{\infty} f(x_0)\,g_0(x, x_0; \tau)\,dx_0. \tag{3.10}$$


The function $g_0(x, x_0; \tau)$ is also referred to as a time-dependent Green’s function or kernel 
or fundamental solution for the preceding diffusion process. Physically, this corresponds to 
the transition probability density of the random variable $x_t$ having value $x_0$ at an initial time 
$t_0$ ($\tau = 0$) and taking on the value $x$ at a later time $t$. For any time value $\tau > 0$ and any 
fixed initial value $x_0$, one readily verifies that this Gaussian-shaped density integrates to unity 
exactly over $x \in (-\infty, \infty)$. In the limit $\tau \to 0$ the kernel is the delta function, thereby 
also integrating to unity, as required. This kernel hence corresponds to the case of no absorption 
outside the entire region; i.e., probability is conserved in the entire region $x \in (-\infty, \infty)$. 
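
As a numerical illustration of these statements, the sketch below evaluates the kernel (3.9), checks that it integrates to unity, and applies the integral (3.10) to an initial condition that is itself a Gaussian kernel. It is only a sketch under stated assumptions: the infinite integration range is truncated, a simple trapezoid rule is used, and the helper names (`g0`, `integrate`, `total_mass`, `propagate`) are illustrative choices, not the book's notation.

```python
import math


def g0(x, x0, tau):
    """Barrier-free kernel (3.9) for the Wiener process dx = sqrt(2) dW."""
    return math.exp(-(x - x0) ** 2 / (4.0 * tau)) / math.sqrt(4.0 * math.pi * tau)


def integrate(f, a, b, n=4000):
    """Composite trapezoid rule on [a, b] with n subintervals."""
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return s * h


def total_mass(x0, tau, width=12.0):
    """Integral of g0 over x, truncated to x0 +/- width standard deviations."""
    sd = math.sqrt(2.0 * tau)  # the variance of the kernel is 2 * tau
    return integrate(lambda x: g0(x, x0, tau), x0 - width * sd, x0 + width * sd)


def propagate(f, x, tau, lo, hi):
    """Equation (3.10), with the x0-integration truncated to [lo, hi]."""
    return integrate(lambda x0: f(x0) * g0(x, x0, tau), lo, hi)


if __name__ == "__main__":
    print("mass of g0:", total_mass(-0.5, 0.75))
    # Propagating a kernel of age 0.5 by a further 0.25 should reproduce the
    # kernel of age 0.75 (Gaussian variances add).
    lhs = propagate(lambda y: g0(y, 0.0, 0.5), 0.3, 0.25, -15.0, 15.0)
    print("propagated:", lhs, " direct:", g0(0.3, 0.0, 0.75))
```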

Let us now consider a solution to the forward-time equation (3.7) by imposing a zero 
boundary condition at a finite upper-barrier value $x = x_H$, i.e., $u(x_H, t; x_0, t_0) = 0$, with 
solution region of interest defined by $x_0, x \le x_H$. As is seen shortly, this gives rise to 
absorption of paths (at $x = x_H$) into the region outside the interval $(-\infty, x_H)$. We will now 
demonstrate the use of the so-called method of images. In this technique the exact solution to 
the forward-time Kolmogorov equation, for arbitrary initial condition $u(x, t = t_0) = f(x)$, is 
obtained by extending the (“physical”) region $x \le x_H$ to include the (“nonphysical”) region 
$x > x_H$ via the definition 

$$\tilde{f}(x) = \begin{cases} f(x), & x \le x_H, \\ -f(2x_H - x), & x > x_H. \end{cases} \tag{3.11}$$

This function is antisymmetric about the point $x = x_H$: 
$\tilde{f}(x_H - \epsilon) = -\tilde{f}(x_H + \epsilon)$ for any $\epsilon > 0$. Then using the solution 
in the form 

$$u(x, t) = \int_{-\infty}^{\infty} \tilde{f}(x_0)\,g_0(x, x_0; \tau)\,dx_0, \tag{3.12}$$

with $g_0$ given by equation (3.9), one can easily show by a change of integration variables 
that $u(x = x_H, t) = 0$. This is a consequence of the antisymmetric property. By splitting this 
integral into the regions $(-\infty, x_H]$ and $(x_H, \infty)$, using equation (3.11), and changing 
integration variables in one of the integrals, one finally has the solution to the initial-value 
problem on the interval $x \in (-\infty, x_H]$ satisfying the forward-time PDE of the form in 
equation (3.7), with $u(x, t = t_0) = f(x)$ and zero-boundary condition $u(x = x_H, t) = 0$: 

$$u(x, t) = \int_{-\infty}^{x_H} f(x_0)\,g^u(x_H, x, x_0; \tau)\,dx_0, \tag{3.13}$$

where 

$$\begin{aligned}
g^u(x_H, x, x_0; \tau) &= g_0(x, x_0; \tau) - g_0(x, 2x_H - x_0; \tau) \\
&= g_0(x, x_0; \tau) - g_0(2x_H - x, x_0; \tau) \\
&= \frac{1}{\sqrt{4\pi\tau}} \left( e^{-(x - x_0)^2/4\tau} - e^{-(x + x_0 - 2x_H)^2/4\tau} \right).
\end{aligned} \tag{3.14}$$

This last quantity is hence the time-dependent Green’s function or kernel 
$u(x, t; x_0, t_0) = g^u(x_H, x, x_0; \tau)$ for the Wiener process in the region $x_0, x \le x_H$, with 
the condition that there is absorption at the barrier level $x = x_H$. The fact that absorption 
occurs when imposing a zero boundary condition on the solution $u$ at a finite level is examined 
more precisely later, where we also show explicitly why $g^u(x_H, x, x_0; \tau)$ is considered a 
probability density for Wiener (Brownian) paths starting from $x_0 < x_H$ and ending at any point 
$x \le x_H$ in time $\tau$, conditional on absorption of all paths crossing the barrier level $x_H$. 
Note that $g^u$ is given by subtracting from the original (i.e., no-barrier) density $g_0$ centered 
at $x_0$ the same density centered at $2x_H - x_0$ within the nonphysical region $x \in (x_H, \infty)$ 
(see Figure 3.2). This is essentially the reflection principle arising from the method of images, 
where the image source is a sink at the point $2x_H - x_0$. Since $g^u$ is a linear combination of 
two solutions to the Kolmogorov equations (which are linear partial differential equations), $g^u$ 
as given by equation (3.14) is then also a solution to the Kolmogorov equations and, moreover, is 
readily seen to satisfy the required zero-boundary condition at the barrier, 
$g^u(x_H, x = x_H, x_0; \tau) = 0$, as well as $g^u(x_H, x = -\infty, x_0; \tau) = 0$. Using the delta 
function definition, we have $\lim_{\tau \to 0} g^u = \delta(x - x_0) - \delta(x - (2x_H - x_0))$. Hence 
from the integral property of the delta function, the solution given by equation (3.13) is indeed 
shown to satisfy the required initial condition. Note that the second delta function does not 
contribute to the integral, for it is centered in the nonphysical region and is precisely the 
term that acts as a so-called sink (or negative point source), as mentioned earlier. 

The foregoing method applies in identical fashion if we are interested in obtaining solutions 
within the upper half-line region $x_0, x \ge x_L$, where $x_L$ is now any finite lower-absorption 
boundary point with $u(x = x_L, t) = 0$. In this case the kernel 
$u(x, t; x_0, t_0) = g^l(x_L, x, x_0; \tau)$ for the Wiener process in the region $x_0, x \ge x_L$, given 
the absorption condition at the lower barrier level $x = x_L$, is given by 
$g^l(x_L, x, x_0; \tau) = g_0(x, x_0; \tau) - g_0(x, 2x_L - x_0; \tau)$, and equation (3.13) 
is replaced by 

$$u(x, t) = \int_{x_L}^{\infty} f(x_0)\,g^l(x_L, x, x_0; \tau)\,dx_0. \tag{3.15}$$

FIGURE 3.2 A sample plot of the kernel $g^u(x_H, x, x_0; \tau)$ for absorption at an upper barrier with 
parameter choices $x_H = 0.5$, $x_0 = -0.5$, $\tau = 0.75$. The thicker solid line gives $g^u$ in the 
physical solution region, while the dashed line extends into the nonphysical region. The plot of 
$g^u$ is obtained by subtracting two barrier-free kernels (i.e., summing the two thin solid lines): 
$g_0(x, x_0; \tau) - g_0(x, 2x_H - x_0; \tau)$, where $2x_H - x_0 = 1.5$. 
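
The image construction can be reproduced numerically. The following Python sketch evaluates the absorbing-barrier kernel as the difference of two barrier-free kernels, using the parameter values quoted in the Figure 3.2 caption; the function names are illustrative, and the tabulated points are an arbitrary choice.

```python
import math


def g0(x, x0, tau):
    """Barrier-free kernel for dx = sqrt(2) dW: Gaussian with variance 2 * tau."""
    return math.exp(-(x - x0) ** 2 / (4.0 * tau)) / math.sqrt(4.0 * math.pi * tau)


def g_upper(x_h, x, x0, tau):
    """Absorbing upper-barrier kernel: source at x0 minus image at 2*x_h - x0."""
    return g0(x, x0, tau) - g0(x, 2.0 * x_h - x0, tau)


X_H, X0, TAU = 0.5, -0.5, 0.75  # parameter choices from the Figure 3.2 caption

if __name__ == "__main__":
    # Physical region (x <= X_H): the kernel is nonnegative and vanishes at X_H.
    for x in (-2.0, -1.0, -0.5, 0.0, 0.25, 0.5):
        print(f"x = {x:5.2f}   g_u = {g_upper(X_H, x, X0, TAU):.6f}")
    # Nonphysical region (x > X_H): the antisymmetric continuation is negative.
    for x in (0.75, 1.5, 2.5):
        print(f"x = {x:5.2f}   g_u = {g_upper(X_H, x, X0, TAU):.6f}")
```

The kernel is antisymmetric about the barrier in $x$, which is why it vanishes exactly at $x = x_H$.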


This therefore gives the solution to the initial-value problem on the interval $x \in [x_L, \infty)$ 
satisfying the forward-time Kolmogorov PDE with arbitrary initial condition $u(x, t_0) = f(x)$ 
and zero-boundary condition $u(x_L, t) = 0$. We note that if $f(x)$ is integrable over the entire 
solution domain, then $u(\infty, t) = 0$ also. Due to the symmetry of the Wiener process, we also 
have $g^l(x_b, x, x_0; \tau) = g^u(x_b, x, x_0; \tau)$ for any real barrier value $x_b$. This follows from 
the symmetry of the Green’s function $g_0(x, x_0; \tau) = g_0(x_0, x; \tau)$. 

It is important to observe that our analysis can be applied similarly to solve the backward-time 
Kolmogorov PDE, where $t_0 = t$ now corresponds to a final-time condition instead of an 
initial-time condition. The foregoing transition density function $g_0$ also satisfies the backward 
PDE with zero-(homogeneous)-boundary conditions at infinity, 
$\lim_{x_0 \to \pm\infty} g_0(x, x_0; \tau) = 0$, given any $x$. If a zero-boundary condition is placed 
at some upper level $x_0 = x_H$, then the solution kernel for equation (3.8) on the interval 
$x, x_0 \in (-\infty, x_H]$ is again given by $u(x, t; x_0, t_0) = g^u(x_H, x, x_0; \tau)$, since 
expression (3.14) satisfies the backward PDE and 
$g^u(x_H, x, x_0 = x_H; \tau) = g^u(x_H, x, x_0 = -\infty; \tau) = 0$ for any fixed $x$. In general, the 
solution to the backward PDE with arbitrary final-time condition $u(x_0, t_0 = t) = \phi(x_0)$ and 
kernel $u(x, t; x_0, t_0)$ can be represented as 

$$u(x_0, t_0) = \int_D \phi(x)\,u(x, t; x_0, t_0)\,dx, \tag{3.16}$$

where the integral is over the appropriate solution interval $D$ and $u$ is the kernel with 
appropriate boundary conditions imposed at the two endpoints. In particular, the solution with 
zero-boundary condition imposed at the endpoint $x_0 = x_H$ is given by the integral 

$$u(x_0, t_0) = \int_{-\infty}^{x_H} \phi(x)\,g^u(x_H, x, x_0; \tau)\,dx, \tag{3.17}$$

while for zero boundary condition at a lower endpoint $x_0 = x_L$, 

$$u(x_0, t_0) = \int_{x_L}^{\infty} \phi(x)\,g^l(x_L, x, x_0; \tau)\,dx. \tag{3.18}$$

If $\phi$ is further assumed to be a compact integrable function over the entire solution domain, 
then $u(x_0, t_0)$ will also have zero-boundary condition as $x_0 \to \pm\infty$ accordingly. 

It is instructive to reconsider the preceding absorbing barrier problem from a different 
point of view using purely probabilistic arguments and basic properties of Brownian paths. 
In particular, let $x_t$ denote the Brownian motion starting at $x_0 < x_H$ at initial time $t_0$ with 
upper absorbing barrier at $x = x_H$. Let $\tilde{x}_t$ denote the same Brownian motion but with no 
barrier, i.e., the standard Brownian (or Wiener) process with transition density 
$g_0(\tilde{x}_t, x_0; \tau)$, $\tau = t - t_0$. Let us focus on the case of an upper barrier (the 
derivation for the case of a lower barrier is similar; see Problem 4 of this section) and set out 
to compute the probability that a path $x_s$, $t_0 \le s \le t$, has the value of $X$ or less at time 
$t$, where $X \le x_H$: 

$$P\{x_t \le X\} = P\{\tilde{x}_t \le X, \sup_{t_0 \le s \le t} \tilde{x}_s < x_H\}. \tag{3.19}$$

This expression follows from the fact that if a free Brownian path $\tilde{x}_t$ crosses the barrier, 
$x_t$ will be absorbed and hence would never attain a value below $x_H$. Now, from first principles 
the total probability for the event 

$$\{\tilde{x}_t \le X\} = \{\tilde{x}_t \le X, \sup_{t_0 \le s \le t} \tilde{x}_s < x_H\} \cup \{\tilde{x}_t \le X, \sup_{t_0 \le s \le t} \tilde{x}_s \ge x_H\}$$

is given by the sum of the probabilities of the two mutually exclusive events: 

$$P\{\tilde{x}_t \le X\} = P\{\tilde{x}_t \le X, \sup_{t_0 \le s \le t} \tilde{x}_s < x_H\} + P\{\tilde{x}_t \le X, \sup_{t_0 \le s \le t} \tilde{x}_s \ge x_H\}. \tag{3.20}$$

Any path contributing to the second term must therefore cross the barrier. The density for 
the $\tilde{x}_t$ motion is given by $g_0(\tilde{x}_t, x_0; \tau)$, so $\tilde{x}_t$ follows a symmetric 
random walk in time. In particular, if we let $t_H < t$ denote the time at which a path first hits 
$x_H$, then the probability density that a Brownian path at $x_H$ at time $t_H$ subsequently attains 
the value $X$ at terminal time $t$ is the same as that for a (reflected) path starting at $x_H$ at 
time $t_H$ and attaining a value $2x_H - X$ at time $t$ (see Figure 3.3). Indeed, for both paths this 
probability density is 

$$g_0(X, x_H; t - t_H) = g_0(2x_H - X, x_H; t - t_H) = \frac{e^{-(X - x_H)^2/4(t - t_H)}}{\sqrt{4\pi(t - t_H)}}. \tag{3.21}$$

Using this, the second term in equation (3.20) becomes 

$$\begin{aligned}
P\{\tilde{x}_t \le X, \sup_{t_0 \le s \le t} \tilde{x}_s \ge x_H\} &= P\{\tilde{x}_t \ge 2x_H - X, \sup_{t_0 \le s \le t} \tilde{x}_s \ge x_H\} \\
&= P\{\tilde{x}_t \ge 2x_H - X\},
\end{aligned} \tag{3.22}$$

where the last term follows because the supremum condition is redundant (any path ending at or 
above $2x_H - X \ge x_H$ has necessarily reached the barrier). Substituting this result into 
equation (3.20) and using equation (3.19) gives 

$$P\{x_t \le X\} = P\{\tilde{x}_t \le X\} - P\{\tilde{x}_t \ge 2x_H - X\} \tag{3.23}$$

for all $X \le x_H$. 

FIGURE 3.3 The reflection principle for Brownian paths. 


Placing the density $g_0$ into equation (3.23) hence gives the probability of any path initiating 
below the barrier at $x_0 < x_H$ and attaining any value $x_t \le X \le x_H$ within a time interval 
$\tau$, with the condition of paths being absorbed if the barrier level $x_H$ is crossed, as: 

$$\begin{aligned}
P\{x_t \le X\} &= \int_{-\infty}^{X} g_0(x, x_0; \tau)\,dx - \int_{2x_H - X}^{\infty} g_0(x, x_0; \tau)\,dx \\
&= \int_{-\infty}^{X} g^u(x_H, x, x_0; \tau)\,dx,
\end{aligned} \tag{3.24}$$

where the last expression is obtained by a change of variable in the second integral. Since the 
density is obtained by differentiating the cumulative probability function (or by the standard 
definition of a cumulative distribution function), we conclude that the kernel 
$g^u(x_H, x, x_0; \tau)$ in equation (3.14), as derived earlier by the method of images, is indeed the 
transition probability density for Brownian motion $x_t$ on the interval $x, x_0 \in (-\infty, x_H]$ 
with an absorbing barrier at $x_H$. The probability in the last equation is readily evaluated as 
the difference of two cumulative normal functions: 

$$P\{x_t \le X\} = N\!\left(\frac{X - x_0}{\sqrt{2\tau}}\right) - N\!\left(\frac{X + x_0 - 2x_H}{\sqrt{2\tau}}\right), \tag{3.25}$$

$\tau = t - t_0$. The absorption of paths crossing the barrier can then be quantified precisely as 
follows. Let $P(\tau)$ denote the probability of any path initiating at $x_0 < x_H$ and terminating 
within time $\tau$ in the interval $x \in (-\infty, x_H]$, conditional on absorption at $x_H$. Then 
$P(\tau) = P\{x_t \le x_H\}$, where the conditional probability is computed using the density $g^u$: 

$$P(\tau) = N\!\left(\frac{x_H - x_0}{\sqrt{2\tau}}\right) - N\!\left(\frac{x_0 - x_H}{\sqrt{2\tau}}\right). \tag{3.26}$$

Hence the probability density does not integrate to unity, and the total probability is in fact 
time dependent with $P(\tau) < 1$, implying absorption with $1 - P(\tau)$ giving the probability of 
absorption. Moreover, $P(\tau) \to 1$ as $\tau \to 0$ and $P(\tau) \to 0$ as $\tau \to \infty$. One can also 
compute the rate of absorption $R(\tau) = -dP(\tau)/d\tau$ or flux across the barrier (i.e., the rate 
at which probability leaks). From equation (3.26) (and the analogous formula for the case of a 
lower barrier, wherein $x_H - x_0$ is replaced by $x_0 - x_L$), we generally have 

$$R(\tau) = \frac{|x_b - x_0|}{2\sqrt{\pi\tau^3}}\,e^{-(x_b - x_0)^2/4\tau}, \tag{3.27}$$

where $x_b$ is either a lower or an upper barrier and $x_0$ is above or below the barrier, respectively. 
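
The survival probability and the absorption rate can be checked against each other numerically. The sketch below assumes the driftless process with $\nu = \sqrt{2}$ of this section; the function names and the finite-difference step are illustrative choices.

```python
import math
from statistics import NormalDist

N = NormalDist().cdf  # standard cumulative normal function


def survival_probability(x_b, x0, tau):
    """Probability of not having been absorbed by time tau, barrier at x_b."""
    z = abs(x_b - x0) / math.sqrt(2.0 * tau)
    return N(z) - N(-z)


def absorption_rate(x_b, x0, tau):
    """Rate of absorption R(tau) = -dP/dtau for a barrier at x_b."""
    d = abs(x_b - x0)
    return d * math.exp(-d * d / (4.0 * tau)) / (2.0 * math.sqrt(math.pi * tau ** 3))


def absorption_rate_fd(x_b, x0, tau, h=1e-5):
    """Central finite-difference estimate of -dP/dtau, for comparison."""
    p_up = survival_probability(x_b, x0, tau + h)
    p_dn = survival_probability(x_b, x0, tau - h)
    return -(p_up - p_dn) / (2.0 * h)


if __name__ == "__main__":
    x_h, x0 = 0.5, -0.5  # illustrative barrier and starting point
    for tau in (0.1, 0.75, 5.0, 50.0):
        print(f"tau = {tau:6.2f}  P = {survival_probability(x_h, x0, tau):.6f}  "
              f"R = {absorption_rate(x_h, x0, tau):.6f}  "
              f"R_fd = {absorption_rate_fd(x_h, x0, tau):.6f}")
```

The printed values illustrate the limits quoted above: the survival probability is close to 1 for small $\tau$ and decays toward 0 as $\tau$ grows.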


3.2.2 Brownian Motion with Drift 


The analysis of the previous section is readily extended to the case of a constant drift $\mu$ 
and constant volatility $\sigma$, i.e., drifted Brownian motion $x_t$ with stochastic increment 
$dx_t = \mu\,dt + \sigma\,dW_t$. The transition density for this process with no barrier [recall 
equation (1.86)] is 

$$g_{0,\mu}(x, x_0; \tau) = \frac{e^{-(x - x_0 - \mu\tau)^2/2\sigma^2\tau}}{\sigma\sqrt{2\pi\tau}}. \tag{3.28}$$

Rewriting gives 

$$g_{0,\mu}(x, x_0; \tau) = e^{\frac{\mu}{\sigma^2}(x - x_0) - \frac{\mu^2}{2\sigma^2}\tau}\,g_0(x, x_0; \tau), \tag{3.29}$$

where 

$$g_0(x, x_0; \tau) = \frac{e^{-(x - x_0)^2/2\sigma^2\tau}}{\sigma\sqrt{2\pi\tau}} \tag{3.30}$$

is the corresponding density for zero drift and no barrier [i.e., the density in equation (3.9) 
with $\tau \to \tfrac{1}{2}\sigma^2\tau$]. A transition probability density function for the drifted 
process, denoted by $u_\mu = u_\mu(x, x_0; \tau)$, is a fundamental solution to the forward and 
backward time-homogeneous Kolmogorov equations, which can be respectively written as 

$$\frac{\partial u_\mu}{\partial \tau} = \frac{1}{2}\sigma^2 \frac{\partial^2 u_\mu}{\partial x^2} - \mu \frac{\partial u_\mu}{\partial x} \tag{3.31}$$

and 

$$\frac{\partial u_\mu}{\partial \tau} = \frac{1}{2}\sigma^2 \frac{\partial^2 u_\mu}{\partial x_0^2} + \mu \frac{\partial u_\mu}{\partial x_0}, \tag{3.32}$$

with delta function condition $\lim_{\tau \to 0} u_\mu(x, x_0; \tau) = \delta(x - x_0)$. For the case of 
free motion on the entire infinite domain, we have $u_\mu = g_{0,\mu}$, since this kernel solves 
equations (3.31) and (3.32) with zero-boundary conditions at $x, x_0 \to \pm\infty$ and 
$\lim_{\tau \to 0} g_{0,\mu}(x, x_0; \tau) = \delta(x - x_0)$. As in the case of zero drift, we are 
interested in further obtaining kernels satisfying zero-boundary conditions at any specified 
finite barrier level. For this purpose, relation (3.29) points to the following generally useful 
result. 





Proposition 3.1. Let $u_\mu(x, x_0; \tau)$ be a fundamental solution to the Kolmogorov equations 
(3.31) and (3.32) for drifted Brownian motion, satisfying homogeneous zero-boundary conditions 
(in $x$ or $x_0$) at any two endpoints of a finite, infinite, or semi-infinite solution domain. 
Assume the corresponding fundamental solution for zero drift ($\mu = 0$) is given by 
$u_0(x, x_0; \tau) \equiv u(x, x_0; \tau)$ and that this solution satisfies the same endpoint 
zero-boundary conditions. Then we have the relation 

$$u_\mu(x, x_0; \tau) = e^{\frac{\mu}{\sigma^2}(x - x_0) - \frac{\mu^2}{2\sigma^2}\tau}\,u(x, x_0; \tau). \tag{3.33}$$

The solution in equation (3.33) is verified by directly substituting into equations (3.31) 
and (3.32), differentiating, and using the fact that $u(x, x_0; \tau)$ solves the same forward and 
backward Kolmogorov equations for $\mu = 0$. In the limit $\tau \to 0$, $u_\mu$ obviously approaches the 
delta function since $u$ does. Moreover, note that the exponential term in equation (3.33) is 
bounded for all finite values of $x, x_0$, and grows only with linear exponent at infinite absolute 
values of $x$ or $x_0$. Hence, any zero-boundary condition on $u$ (placed at a finite or infinite 
point in $x$ or $x_0$) is automatically also satisfied by $u_\mu$ at the same point. 

Based on the foregoing proposition, the barrier kernels for the drifted Wiener process are 
automatically obtained from those for zero drift. Although in this section we are explicitly 
discussing only the single-barrier case, the reader should realize that the proposition also 
applies directly to the case of the double-barrier kernels. Using equation (3.14) (with the 
replacement $\tau \to \tfrac{1}{2}\sigma^2\tau$) for the case of an upper absorbing barrier at 
$x = x_H$, the transition density denoted by $u_\mu = g^u_\mu$ on the domain $x, x_0 \in (-\infty, x_H]$ 
is then equivalently given by 

$$\begin{aligned}
g^u_\mu(x_H, x, x_0; \tau) &= \frac{e^{\frac{\mu}{\sigma^2}(x - x_0) - \frac{\mu^2}{2\sigma^2}\tau}}{\sigma\sqrt{2\pi\tau}} \left( e^{-(x - x_0)^2/2\sigma^2\tau} - e^{-(x + x_0 - 2x_H)^2/2\sigma^2\tau} \right) \\
&= g_{0,\mu}(x, x_0; \tau) - e^{\frac{2\mu}{\sigma^2}(x_H - x_0)}\,g_{0,\mu}(x, 2x_H - x_0; \tau) \\
&= g_{0,\mu}(x, x_0; \tau) \left[ 1 - e^{-2(x - x_H)(x_0 - x_H)/(\sigma^2\tau)} \right],
\end{aligned} \tag{3.34}$$

where the function $g_{0,\mu}$ is defined by equation (3.28). This density satisfies zero-boundary 
conditions at the barrier level $x, x_0 = x_H$ as well as at $x, x_0 \to -\infty$, as required. The 
kernel for the case of a lower barrier at $x = x_L$ is identical, with transition density for 
$x, x_0 \in [x_L, \infty)$ given by $g^l_\mu(x_L, x, x_0; \tau) = g^u_\mu(x_L, x, x_0; \tau)$, with 
zero-boundary condition at $x, x_0 = x_L$ and at $x, x_0 \to \infty$. It is easy to verify by 
comparison of the relative magnitudes of the exponents that these densities are indeed strictly 
nonnegative on their respective semi-infinite solution domains. 
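
The equivalence of the image form and the factored form of the drifted barrier kernel, and the fact that the kernel satisfies the forward Kolmogorov equation with drift, can be spot-checked numerically. The Python sketch below is an illustration only: the function names, the test point, and the finite-difference step are assumptions, and the residual is zero only up to discretization error.

```python
import math


def g0_drift(x, x0, tau, mu, sigma):
    """Barrier-free density of drifted Brownian motion dx = mu dt + sigma dW."""
    var = sigma * sigma * tau
    return math.exp(-(x - x0 - mu * tau) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def g_upper_images(x_h, x, x0, tau, mu, sigma):
    """Image form: free density minus an exponentially weighted image density."""
    weight = math.exp(2.0 * mu * (x_h - x0) / sigma ** 2)
    return (g0_drift(x, x0, tau, mu, sigma)
            - weight * g0_drift(x, 2.0 * x_h - x0, tau, mu, sigma))


def g_upper_factored(x_h, x, x0, tau, mu, sigma):
    """Factored form: free density times a damping factor vanishing at x_h."""
    damp = math.exp(-2.0 * (x - x_h) * (x0 - x_h) / (sigma ** 2 * tau))
    return g0_drift(x, x0, tau, mu, sigma) * (1.0 - damp)


def forward_residual(x_h, x, x0, tau, mu, sigma, h=1e-3):
    """Finite-difference residual of u_tau - (sigma^2/2) u_xx + mu u_x."""
    u = lambda xx, tt: g_upper_images(x_h, xx, x0, tt, mu, sigma)
    u_tau = (u(x, tau + h) - u(x, tau - h)) / (2.0 * h)
    u_x = (u(x + h, tau) - u(x - h, tau)) / (2.0 * h)
    u_xx = (u(x + h, tau) - 2.0 * u(x, tau) + u(x - h, tau)) / (h * h)
    return u_tau - 0.5 * sigma ** 2 * u_xx + mu * u_x


if __name__ == "__main__":
    args = (0.5, 0.0, -0.5, 0.75, 0.3, 1.0)  # x_h, x, x0, tau, mu, sigma (illustrative)
    print("image form:   ", g_upper_images(*args))
    print("factored form:", g_upper_factored(*args))
    print("PDE residual: ", forward_residual(*args))
```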

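These kernels can be used to provide analogous probability formulas to those in the 
previous section. For example, the kernel $g^l_\mu$ can be used to compute the probability that a 
drifted Brownian path initiating at any point above the barrier at $x_0 > x_L$, at time $t_0$, 
attains any value $x_t \ge X$, for $X \ge x_L$, within a time interval $t - t_0 = \tau$, conditional on 
the path being absorbed if it crosses below the barrier level $x_L$, as 

$$\begin{aligned}
P\{x_t \ge X \ge x_L \mid x_0 > x_L\} &= \int_{X}^{\infty} g^l_\mu(x_L, x, x_0; \tau)\,dx \\
&= N\!\left(\frac{x_0 - X + \mu\tau}{\sigma\sqrt{\tau}}\right) - e^{\frac{2\mu}{\sigma^2}(x_L - x_0)}\,N\!\left(\frac{2x_L - x_0 - X + \mu\tau}{\sigma\sqrt{\tau}}\right).
\end{aligned} \tag{3.35}$$

The analogous probability for the case of an upper barrier at $x_H$ is 

$$\begin{aligned}
P\{x_t \le X \le x_H \mid x_0 < x_H\} &= \int_{-\infty}^{X} g^u_\mu(x_H, x, x_0; \tau)\,dx \\
&= N\!\left(\frac{X - x_0 - \mu\tau}{\sigma\sqrt{\tau}}\right) - e^{\frac{2\mu}{\sigma^2}(x_H - x_0)}\,N\!\left(\frac{X + x_0 - 2x_H - \mu\tau}{\sigma\sqrt{\tau}}\right).
\end{aligned} \tag{3.36}$$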
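
The closed-form probabilities can be compared with a direct numerical integration of the drifted barrier kernel. The sketch below is illustrative only: the names, the truncation of the lower integration limit, and the trapezoid rule are assumptions made for the example.

```python
import math
from statistics import NormalDist

N = NormalDist().cdf  # standard cumulative normal function


def prob_below_upper(X, x_h, x0, tau, mu, sigma):
    """Probability of ending at or below X <= x_h without absorption at x_h."""
    s = sigma * math.sqrt(tau)
    w = math.exp(2.0 * mu * (x_h - x0) / sigma ** 2)
    return N((X - x0 - mu * tau) / s) - w * N((X + x0 - 2.0 * x_h - mu * tau) / s)


def prob_above_lower(X, x_l, x0, tau, mu, sigma):
    """Probability of ending at or above X >= x_l without absorption at x_l."""
    s = sigma * math.sqrt(tau)
    w = math.exp(2.0 * mu * (x_l - x0) / sigma ** 2)
    return N((x0 - X + mu * tau) / s) - w * N((2.0 * x_l - x0 - X + mu * tau) / s)


def kernel_upper(x_h, x, x0, tau, mu, sigma):
    """Drifted kernel with absorption at x_h, in factored form."""
    var = sigma ** 2 * tau
    free = math.exp(-(x - x0 - mu * tau) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
    return free * (1.0 - math.exp(-2.0 * (x - x_h) * (x0 - x_h) / var))


def integrate(f, a, b, n=20000):
    """Composite trapezoid rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))


if __name__ == "__main__":
    X, x_h, x0, tau, mu, sigma = 0.2, 0.5, -0.5, 0.75, 0.3, 1.0  # illustrative values
    closed = prob_below_upper(X, x_h, x0, tau, mu, sigma)
    numeric = integrate(lambda x: kernel_upper(x_h, x, x0, tau, mu, sigma), x0 - 12.0, X)
    print("closed form:", closed, " numerical:", numeric)
```

Reflecting all coordinates and the drift ($x \to -x$, $\mu \to -\mu$) maps the upper-barrier formula into the lower-barrier one, which gives a second consistency check.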





Problems 


Problem 1. Consider the Wiener process with lower absorbing barrier as discussed in 
Section 3.2.1. Obtain analogues of equations (3.19) through (3.27). Provide an 
analogous plot to the one in Figure 3.2 for the kernel $g^l(x_L, x, x_0; \tau)$. 


Problem 2. What are the limiting values of $P(\tau)$ and $R(\tau)$ in equations (3.26) and (3.27) as 
$x_H \to \infty$? Explain. 


Problem 3. Obtain formulas for $P(\tau)$ and $R(\tau)$ for the case of a driftless Wiener process 
with constant volatility $\sigma$. Explain the dependence of $P(\tau)$ and $R(\tau)$ on volatility. What are 
the limiting values as $\tau \to \infty$ and $\sigma \to 0$? 


Problem 4. Consider driftless Brownian motion with constant volatility $\nu(x) = \sigma$ and 
absorption at a lower barrier $x_L$. Using steps similar to those in equations (3.19) to (3.25), 
show that a path $x_s$, $t_0 \le s \le t$, conditional on starting at $x_0 > x_L$ at time $t_0$, has value 
$x_t \ge X$ at time $t$, where $X \ge x_L$, with probability given by 

$$P\{x_t \ge X\} = N\!\left(\frac{x_0 - X}{\sigma\sqrt{\tau}}\right) - N\!\left(\frac{2x_L - x_0 - X}{\sigma\sqrt{\tau}}\right), \tag{3.37}$$

where $\tau = t - t_0$. Show that this result is consistent with equation (3.35) when $\mu = 0$. 


Problem 5. By using equations (3.35) and (3.36) with $X = x_L$ and $X = x_H$, respectively, derive 
an expression for the rate of absorption across a barrier. Explain the particular dependence 
on the drift rate $\mu$. 


3.3 Pricing Kernels and European Barrier Option Formulas 
for Geometric Brownian Motion 


The kernels for the drifted Brownian motion obtained in the previous section can be used 
to provide exact pricing kernels, and hence pricing formulas, for the case in which the underlying 
asset price process $S_t$ at time $t$ is assumed to obey a linear volatility and linear drift model 
(i.e., geometric Brownian motion or the standard Black-Scholes model): 

$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t, \qquad S_t > 0.$$

Let us begin by defining the variable transformation $x = X(S) \equiv \log(S)$, with inverse 
$S = e^x$, mapping the domains $x \in (-\infty, \infty)$ and $S \in (0, \infty)$ into one another. From 
Itô’s lemma, the process $x_t = \log S_t$ has SDE 

$$dx_t = \left(\mu - \frac{\sigma^2}{2}\right) dt + \sigma\,dW_t.$$

Hence, the transition density for the random variable $\log S_t$ is given by the transition density 
for the simple Brownian motion $x_t$ with constant drift $\mu - \tfrac{1}{2}\sigma^2$ and volatility 
$\sigma$. Changing variables with Jacobian $d\log S/dS = 1/S$ therefore gives a general relationship 
between the $S$-space and the $x$-space densities: 

$$U(S, S_0; \tau) = \frac{1}{S}\,u_{(\mu - \frac{1}{2}\sigma^2)}(X(S), X(S_0); \tau), \tag{3.38}$$

for all $S, S_0 > 0$. Here the notation $u_\mu$ refers to a kernel for simple Brownian motion with 
drift $\mu$, as discussed in the previous section. It is also readily shown by direct substitution, 
using equations (3.31) and (3.32), that the density $U$ satisfies the appropriate forward and 
backward Kolmogorov equations in $S, S_0$ (i.e., the Kolmogorov equations for lognormal diffusion 
with linear drift and volatility functions $\mu S$ and $\sigma S$, respectively, as discussed in 
Section 1.13). 

Relation (3.38) holds true for any homogeneous zero-boundary conditions. The case of 
zero-boundary conditions imposed on the pricing kernel $U$ at $S, S_0 \to 0$ and $S, S_0 \to \infty$ 
corresponds to imposing zero-boundary conditions on the kernel $u$ at $x, x_0 \to \pm\infty$. Such 
boundary conditions give free geometric Brownian motion on the entire half-line 
$S, S_0 \in (0, \infty)$. The pricing kernel for the case of no barriers, denoted by $U_0$, is then 
obtained via equation (3.38) by substituting the barrier-free solution for drifted Brownian motion 
of the previous section, $u_{\mu - \frac{1}{2}\sigma^2} = g_{0,\mu - \frac{1}{2}\sigma^2}(x, x_0; \tau)$, with 
$x = X(S) = \log S$, $x_0 = X(S_0) = \log S_0$, giving 

$$\begin{aligned}
U_0(S, S_0; \tau) &= \frac{1}{S}\,g_{0,\mu - \frac{1}{2}\sigma^2}(\log S, \log S_0; \tau) \\
&= \frac{1}{\sigma S \sqrt{2\pi\tau}}\,e^{-[\log(S/S_0) - (\mu - \sigma^2/2)\tau]^2/2\sigma^2\tau}.
\end{aligned} \tag{3.39}$$


This is the familiar lognormal density for the Black-Scholes model discussed in Chapter 1. 
However, here we arrived at this density from a different perspective, one that allows us to 
readily derive pricing kernels subject to different boundary conditions. To obtain the pricing 
kernel for the case of a single absorbing barrier at $S = H$, the barrier points in the two spaces 
are related by $x_H = X(H) = \log H$. Then by simply substituting the appropriate single-barrier 
$x$-space kernel of equation (3.34) into equation (3.38) we obtain the equivalent forms: 

$$\begin{aligned}
U(H, S, S_0; \tau) &= \frac{1}{S}\,g^u_{\mu - \frac{1}{2}\sigma^2}(\log H, \log S, \log S_0; \tau) \\
&= \frac{1}{S} \left[ g_{0,\mu - \frac{1}{2}\sigma^2}(\log S, \log S_0; \tau) - (H/S_0)^{\frac{2\mu}{\sigma^2} - 1}\,g_{0,\mu - \frac{1}{2}\sigma^2}(\log S, \log(H^2/S_0); \tau) \right] \\
&= U_0(S, S_0; \tau) - (H/S_0)^{\frac{2\mu}{\sigma^2} - 1}\,U_0(S, H^2/S_0; \tau) \\
&= U_0(S, S_0; \tau) \left[ 1 - \exp\!\left( -\frac{\log(S/H)\,\log(S_0/H)}{\frac{1}{2}\sigma^2\tau} \right) \right],
\end{aligned} \tag{3.40}$$

where $U_0$ is given by equation (3.39). This single-barrier kernel hence satisfies zero-boundary 
conditions at the barrier value for both $S = H$ and $S_0 = H$, as well as approaching zero 
as $S, S_0 \to 0$ and as $S, S_0 \to \infty$. Kernel (3.40) is therefore valid as a single-barrier kernel 
(transition probability density) for either the lower domain, $S, S_0 \in (0, H]$, or the upper 
domain, $S, S_0 \in [H, \infty)$, with level $H$ being an upper barrier or lower barrier, respectively. 
The price level $H$ therefore plays the role of either upper or lower barrier in the respective 
solution domains. 
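
The agreement between the image form and the factored form of the single-barrier pricing kernel can be confirmed numerically. In the following Python sketch the function names and the parameter values are illustrative assumptions; the formulas are the lognormal barrier-free density and the two equivalent barrier forms described above.

```python
import math


def u0(S, S0, tau, mu, sigma):
    """Barrier-free lognormal transition density in S-space."""
    var = sigma * sigma * tau
    z = math.log(S / S0) - (mu - 0.5 * sigma * sigma) * tau
    return math.exp(-z * z / (2.0 * var)) / (S * math.sqrt(2.0 * math.pi * var))


def u_barrier_images(H, S, S0, tau, mu, sigma):
    """Image form: U0(S, S0) minus a power-weighted U0(S, H^2 / S0)."""
    power = 2.0 * mu / sigma ** 2 - 1.0
    return u0(S, S0, tau, mu, sigma) - (H / S0) ** power * u0(S, H * H / S0, tau, mu, sigma)


def u_barrier_factored(H, S, S0, tau, mu, sigma):
    """Factored form: U0(S, S0) times a factor that vanishes at S = H or S0 = H."""
    damp = math.exp(-2.0 * math.log(S / H) * math.log(S0 / H) / (sigma ** 2 * tau))
    return u0(S, S0, tau, mu, sigma) * (1.0 - damp)


if __name__ == "__main__":
    tau, mu, sigma = 0.5, 0.05, 0.2  # illustrative parameters
    # Upper domain: H = 90 acts as a lower barrier for S, S0 >= H.
    for S in (90.0, 95.0, 100.0, 120.0):
        print(S, u_barrier_images(90.0, S, 100.0, tau, mu, sigma),
              u_barrier_factored(90.0, S, 100.0, tau, mu, sigma))
    # Lower domain: H = 110 acts as an upper barrier for S, S0 <= H.
    for S in (80.0, 100.0, 105.0, 110.0):
        print(S, u_barrier_images(110.0, S, 100.0, tau, mu, sigma),
              u_barrier_factored(110.0, S, 100.0, tau, mu, sigma))
```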

Pricing kernel (3.40) can be used to obtain exact analytical formulas for various types 
of single-barrier European-style options under the Black-Scholes model where $\mu = r$, the 
assumed interest rate. If the underlying asset has constant dividend yield $q$, then $\mu = r - q$. 
Without loss in generality, in what follows we derive explicit formulas for $q = 0$.² Given 
an arbitrary payoff function $\Lambda(S)$ at maturity time $T$, the fair value at current time $t_0$ and 
spot price $S_0 > H$ of a down-and-out option with barrier level $H$ is given by the discounted 
risk-neutral expectation over the domain above the barrier: 

$$V^{DO}(S_0, \tau) = e^{-r\tau} \int_{H}^{\infty} U(H, S, S_0; \tau)\,\Lambda(S)\,dS, \tag{3.41}$$

² The pricing formulas for $q \ne 0$ obtain trivially from the $q = 0$ formulas. Indeed, let 
$V(S_0, r, q, \tau)$ represent any option-pricing function for the case of a constant dividend $q$. 
Then from the discounted risk-neutral pricing integrals we directly have 
$V(S_0, r, q, \tau) = e^{-q\tau}\,V(S_0, r - q, \tau)$, where the latter is the corresponding 
option-pricing function $V(S_0, r, \tau)$ derived for zero dividend but with subsequent drift 
replacement $r \to r - q$. 



where the option price is considered a function of $\tau = T - t_0$, the time to maturity. Recall 
from previous contract definitions that this option automatically expires worthless if the stock 
or asset price $S_t$ attains or falls below the barrier price level $H$ for any time before maturity. 
The value of the corresponding up-and-out option with spot price $S_0 < H$ is given by the 
discounted risk-neutral expectation over the domain below the barrier: 

$$V^{UO}(S_0, \tau) = e^{-r\tau} \int_{0}^{H} U(H, S, S_0; \tau)\,\Lambda(S)\,dS. \tag{3.42}$$

The values of the knock-in barrier options (i.e., the up-and-in and down-and-in options) 
follow simply by (knock-in)-(knockout) symmetry: 

$$V^{DI} + V^{DO} = V^{UI} + V^{UO} = V,$$

where 

$$V(S_0, \tau) = e^{-r\tau} \int_{0}^{\infty} U_0(S, S_0; \tau)\,\Lambda(S)\,dS \tag{3.43}$$

is the value of the plain European option. [Note that these integral solutions are consistent 
with the fact that $V$, $V^{DO}$, $V^{DI}$, $V^{UO}$, and $V^{UI}$ all satisfy the usual time-homogeneous 
Black-Scholes partial differential equation (BSPDE) in the variables $S_0, \tau$ with appropriate 
boundary values in $S_0$ and whose value at zero time to maturity is determined uniquely by the 
pay-off (and the barrier level with respect to $S_0$ in the case of the barrier options). This 
follows, since one can interchange the order of taking partial derivatives in $S_0, \tau$ with 
integrating over $S$, and using the fact that $e^{-r\tau}U_0$ and $e^{-r\tau}U$ solve the BSPDE in 
$S_0, \tau$ (for fixed $S$) with appropriate boundary conditions in $S_0$ and delta function value at 
zero time to maturity.] 

Recall from contract definitions that the knock-in options have zero value unless the 
asset price $S_t$ attains the barrier at a time before maturity time $T$, upon which the option 
immediately becomes the plain European. The foregoing symmetry relation follows from the 
fact that the knock-in solution is expressible as a linear combination of the knockout and 
barrier-free solutions. The unique combination then follows by satisfying boundary conditions. 
In particular, $V^{DI} = V - V^{DO}$ since at the barrier the knock-in must have the same value as 
the plain option: $V^{DI}(S_0 = H, \tau) = V(S_0 = H, \tau)$ for all nonzero times to maturity. Also, at 
the other boundary, $S_0 = 0$, the two option prices must both equal zero. Finally, at maturity, 
$V^{DI}(S_0, \tau = 0) = V(S_0, \tau = 0) - V^{DO}(S_0, \tau = 0) = 0$ since $V$ and $V^{DO}$ are equal for 
all $S_0 > H$ at zero time to maturity. This last property (i.e., the initial condition $\tau = 0$) 
must be satisfied since the asset price starts above the barrier and stays there, hence the 
barrier is never attained, giving zero value for the knock-in. A similar argument applied to the 
up-and-in option also leads to the foregoing symmetry. 

We now provide the derivation of exact pricing formulas for single-barrier European calls
and puts. As a first example, we consider a down-and-out call with strike $K$ and barrier level
$H$. In this case $\Lambda(S) = (S - K)_+$, and equation (3.41) gives

\[
C^{DO}(H, S_0, K, \tau) = e^{-r\tau} \int_B^\infty U(H, S, S_0; \tau)(S - K)\,dS, \tag{3.44}
\]

where $B = H$ if $H > K$ and $B = K$ if $H < K$. In general, it proves useful to evaluate the two
integrals defined by

\[
\varphi(B) = \int_B^\infty U(H, S, S_0; \tau)\,dS \tag{3.45}
\]

and

\[
\bar\varphi(B) = \int_B^\infty U(H, S, S_0; \tau)\,S\,dS \tag{3.46}
\]


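for any $B > 0$. Using equation (3.40) with $\mu = r$, and changing integration variable $S = e^x$,
we have

\[
\varphi(B) = \int_{\log B}^\infty \left[ g_{0,\,r-\frac{1}{2}\sigma^2}(x, \log S_0; \tau) - \left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}-1} g_{0,\,r-\frac{1}{2}\sigma^2}(x, \log(H^2/S_0); \tau) \right] dx.
\]

This integral is evaluated using steps similar to those in previous derivations of the
Black-Scholes formula for a plain call. In particular, using equation (3.28) gives

\[
\begin{aligned}
\int_{\log B}^\infty g_{0,\,r-\frac{1}{2}\sigma^2}(x, \log S_0; \tau)\,dx
&= \frac{1}{\sigma\sqrt{2\pi\tau}} \int_{\log B}^\infty e^{-\left(x - \log S_0 - (r-\frac{1}{2}\sigma^2)\tau\right)^2/2\sigma^2\tau}\,dx \\
&= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\frac{\log(S_0/B) + (r-\frac{1}{2}\sigma^2)\tau}{\sigma\sqrt{\tau}}} e^{-y^2/2}\,dy \\
&= N\!\left(d_-\!\left(\frac{S_0}{B}\right)\right),
\end{aligned} \tag{3.47}
\]

where here and throughout we define

\[
d_+(x) = \frac{\log x + (r + \frac{1}{2}\sigma^2)\tau}{\sigma\sqrt{\tau}}, \tag{3.48}
\]

where $d_-(x) = d_+(x) - \sigma\sqrt{\tau}$. The second line in equation (3.47) follows simply by a linear
change of variables $x = \log S_0 + (r - \frac{1}{2}\sigma^2)\tau - \sigma\sqrt{\tau}\,y$. The second term in $\varphi(B)$ is integrated
in identical fashion, with $S_0$ replaced by $H^2/S_0$, and combining gives

\[
\varphi(B) = N\!\left(d_-\!\left(\frac{S_0}{B}\right)\right) - \left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}-1} N\!\left(d_-\!\left(\frac{H^2}{S_0 B}\right)\right). \tag{3.49}
\]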
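The closed form (3.49) can be checked numerically against a direct quadrature of the two Gaussian terms in the integrand. The following is a minimal sketch under stated assumptions: the function names `norm_cdf`, `phi_closed`, and `phi_numeric` are illustrative and not from the text, the midpoint rule and the truncation of the upper limit at ten standard deviations are arbitrary choices, and the parameter values in the usage note are hypothetical.

```python
from math import erf, exp, log, pi, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def phi_closed(B, H, S0, r, sigma, tau):
    """Closed form (3.49) for the integral of the barrier kernel over S >= B."""
    s = sigma * sqrt(tau)

    def d_minus(x):
        return (log(x) + (r - 0.5 * sigma**2) * tau) / s

    power = 2.0 * r / sigma**2 - 1.0
    return norm_cdf(d_minus(S0 / B)) - (H / S0) ** power * norm_cdf(
        d_minus(H * H / (S0 * B))
    )


def phi_numeric(B, H, S0, r, sigma, tau, n=20000):
    """Midpoint-rule integral over x = log S of the two Gaussian terms."""
    s = sigma * sqrt(tau)
    nu = (r - 0.5 * sigma**2) * tau
    power = 2.0 * r / sigma**2 - 1.0

    def g(x, x0):
        # Gaussian density in x with mean x0 + nu and standard deviation s
        return exp(-((x - x0 - nu) ** 2) / (2.0 * s * s)) / (s * sqrt(2.0 * pi))

    lo = log(B)
    hi = max(log(S0), lo) + abs(nu) + 10.0 * s  # truncate the far Gaussian tail
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += g(x, log(S0)) - (H / S0) ** power * g(x, log(H * H / S0))
    return total * h
```

For instance, with the hypothetical inputs $B = 95$, $H = 90$, $S_0 = 100$, $r = 0.05$, $\sigma = 0.2$, $\tau = 0.5$, the two functions should agree to well within the quadrature error.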


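The integrand for the $\bar\varphi(B)$ integral is similar, except for an extra $e^x$ factor. Upon completing
the squares in the integrand exponents and using similar steps as before, one readily obtains

\[
\bar\varphi(B) = e^{r\tau}\left[ S_0 N\!\left(d_+\!\left(\frac{S_0}{B}\right)\right) - H\left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}} N\!\left(d_+\!\left(\frac{H^2}{S_0 B}\right)\right) \right]. \tag{3.50}
\]

From equations (3.44), (3.45), and (3.46), $C^{DO}(H, S_0, K, \tau) = e^{-r\tau}[\bar\varphi(B) - K\varphi(B)]$. Hence
plugging the value $B = H$ if $H > K$ ($B = K$ if $H < K$) gives the exact pricing formula for
the down-and-out option in terms of cumulative normal distribution functions:

\[
\begin{aligned}
C^{DO}(H, S_0, K, \tau) ={}& S_0 N\!\left(d_+\!\left(\frac{S_0}{H}\right)\right) - H\left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}} N\!\left(d_+\!\left(\frac{H}{S_0}\right)\right) \\
& - Ke^{-r\tau} N\!\left(d_-\!\left(\frac{S_0}{H}\right)\right) + Ke^{-r\tau}\left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}-1} N\!\left(d_-\!\left(\frac{H}{S_0}\right)\right)
\end{aligned} \tag{3.51}
\]

for $H > K$, and

\[
\begin{aligned}
C^{DO}(H, S_0, K, \tau) ={}& S_0 N\!\left(d_+\!\left(\frac{S_0}{K}\right)\right) - H\left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}} N\!\left(d_+\!\left(\frac{H^2}{S_0 K}\right)\right) \\
& - Ke^{-r\tau} N\!\left(d_-\!\left(\frac{S_0}{K}\right)\right) + Ke^{-r\tau}\left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}-1} N\!\left(d_-\!\left(\frac{H^2}{S_0 K}\right)\right)
\end{aligned} \tag{3.52}
\]

for $H < K$. Note that for the case $H < K$, one also has the compact form in terms of plain calls:

\[
C^{DO}(H, S_0, K, \tau) = C(S_0, K, \tau) - \left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}-1} C(H^2/S_0, K, \tau). \tag{3.53}
\]

From the symmetry $C^{DI} + C^{DO} = C$, this expression gives the down-and-in value $C^{DI}$ explicitly.
Rearranging equation (3.51) we can also extract an exact expression for $C^{DI}$ when $H > K$.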
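As an illustration, formulas (3.51) and (3.53) translate directly into a short routine. This is a sketch only: the function names (`plain_call`, `down_and_out_call`) and the sample inputs are assumptions made here, and the standard normal distribution function is built from `math.erf` rather than taken from any particular library.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def d_plus(x, r, sigma, tau):
    """d_+(x) as defined in equation (3.48)."""
    return (log(x) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))


def d_minus(x, r, sigma, tau):
    """d_-(x) = d_+(x) - sigma * sqrt(tau)."""
    return d_plus(x, r, sigma, tau) - sigma * sqrt(tau)


def plain_call(S0, K, r, sigma, tau):
    """Black-Scholes value of a plain European call."""
    return S0 * norm_cdf(d_plus(S0 / K, r, sigma, tau)) - K * exp(-r * tau) * norm_cdf(
        d_minus(S0 / K, r, sigma, tau)
    )


def down_and_out_call(H, S0, K, r, sigma, tau):
    """Down-and-out call for S0 >= H: (3.53) if H <= K, (3.51) if H > K."""
    a = 2.0 * r / sigma**2
    if H <= K:
        return plain_call(S0, K, r, sigma, tau) - (H / S0) ** (a - 1.0) * plain_call(
            H * H / S0, K, r, sigma, tau
        )
    disc_k = K * exp(-r * tau)
    x, y = S0 / H, H / S0
    return (
        S0 * norm_cdf(d_plus(x, r, sigma, tau))
        - H * y**a * norm_cdf(d_plus(y, r, sigma, tau))
        - disc_k * norm_cdf(d_minus(x, r, sigma, tau))
        + disc_k * y ** (a - 1.0) * norm_cdf(d_minus(y, r, sigma, tau))
    )
```

Two quick consistency checks are that the value vanishes at $S_0 = H$ in both branches and that the two branches agree when $H = K$; the down-and-in call then follows as `plain_call(...) - down_and_out_call(...)`.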

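The down-and-out put value $P^{DO} = 0$ for $H > K$, since the put payoff $(K - S)_+$ is zero
throughout the integration domain $S \ge H$ in this trivial case. Symmetry then gives $P^{DI} = P$,
the plain European put value. In contrast, the case $H < K$ gives

\[
\begin{aligned}
P^{DO}(H, S_0, K, \tau) ={}& e^{-r\tau} \int_H^K U(H, S, S_0; \tau)(K - S)\,dS \\
={}& e^{-r\tau}\left[ K(\varphi(H) - \varphi(K)) + \bar\varphi(K) - \bar\varphi(H) \right] \\
={}& Ke^{-r\tau}\left[ N\!\left(d_-\!\left(\frac{S_0}{H}\right)\right) - N\!\left(d_-\!\left(\frac{S_0}{K}\right)\right) \right]
 - Ke^{-r\tau}\left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}-1}\left[ N\!\left(d_-\!\left(\frac{H}{S_0}\right)\right) - N\!\left(d_-\!\left(\frac{H^2}{S_0 K}\right)\right) \right] \\
& + S_0\left[ N\!\left(d_+\!\left(\frac{S_0}{K}\right)\right) - N\!\left(d_+\!\left(\frac{S_0}{H}\right)\right) \right]
 + H\left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}}\left[ N\!\left(d_+\!\left(\frac{H}{S_0}\right)\right) - N\!\left(d_+\!\left(\frac{H^2}{S_0 K}\right)\right) \right].
\end{aligned} \tag{3.54}
\]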


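By using $C(S_0, K, \tau) = S_0 N(d_+(S_0/K)) - Ke^{-r\tau} N(d_-(S_0/K))$, the property $N(d_+(S_0/H)) =
1 - N(-d_+(S_0/H))$, and put-call parity for the plain call and put option price, this result is
also expressible as

\[
P^{DO}(H, S_0, K, \tau) = P(S_0, K, \tau) - P^{DI}(H, S_0, K, \tau), \tag{3.55}
\]

where

\[
\begin{aligned}
P^{DI}(H, S_0, K, \tau) ={}& -S_0 N\!\left(-d_+\!\left(\frac{S_0}{H}\right)\right) + Ke^{-r\tau} N\!\left(-d_-\!\left(\frac{S_0}{H}\right)\right) \\
& + Ke^{-r\tau}\left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}-1}\left[ N\!\left(d_-\!\left(\frac{H}{S_0}\right)\right) - N\!\left(d_-\!\left(\frac{H^2}{S_0 K}\right)\right) \right] \\
& - H\left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}}\left[ N\!\left(d_+\!\left(\frac{H}{S_0}\right)\right) - N\!\left(d_+\!\left(\frac{H^2}{S_0 K}\right)\right) \right]
\end{aligned} \tag{3.56}
\]

is the value of the down-and-in put. Note that for $H = K$ these expressions give
$P^{DI}(H, S_0, K, \tau) = P(S_0, K, \tau)$, the plain put value, and $P^{DO} = 0$, as required.

Up-and-out calls and puts are obtained using equation (3.42). For a put we have

\[
\begin{aligned}
P^{UO}(H, S_0, K, \tau) ={}& e^{-r\tau} \int_0^B U(H, S, S_0; \tau)(K - S)\,dS \\
={}& Ke^{-r\tau} \int_0^B U\,dS - e^{-r\tau} \int_0^B U\,S\,dS \\
={}& Ke^{-r\tau}[\varphi(0) - \varphi(B)] + [\bar\varphi(B) - \bar\varphi(0)]e^{-r\tau} \\
={}& Ke^{-r\tau}\left[ N\!\left(-d_-\!\left(\frac{S_0}{B}\right)\right) - \left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}-1} N\!\left(-d_-\!\left(\frac{H^2}{S_0 B}\right)\right) \right] \\
& - S_0\left[ N\!\left(-d_+\!\left(\frac{S_0}{B}\right)\right) - \left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}+1} N\!\left(-d_+\!\left(\frac{H^2}{S_0 B}\right)\right) \right],
\end{aligned}
\]

where $B = H$ for $H < K$ and $B = K$ for $H > K$. Here we have used the properties $N(d_\pm(\infty)) =
N(\infty) = 1$ and $1 - N(x) = N(-x)$. Substituting $B = H$ or $B = K$ then gives the exact
expressions for the up-and-out put:

\[
\begin{aligned}
P^{UO}(H, S_0, K, \tau) ={}& -S_0 N\!\left(-d_+\!\left(\frac{S_0}{H}\right)\right) + Ke^{-r\tau} N\!\left(-d_-\!\left(\frac{S_0}{H}\right)\right) \\
& + H\left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}} N\!\left(-d_+\!\left(\frac{H}{S_0}\right)\right) - Ke^{-r\tau}\left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}-1} N\!\left(-d_-\!\left(\frac{H}{S_0}\right)\right)
\end{aligned} \tag{3.57}
\]

for $H < K$, and

\[
P^{UO}(H, S_0, K, \tau) = P(S_0, K, \tau) + S_0\left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}+1} N\!\left(-d_+\!\left(\frac{H^2}{S_0 K}\right)\right) - Ke^{-r\tau}\left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}-1} N\!\left(-d_-\!\left(\frac{H^2}{S_0 K}\right)\right) \tag{3.58}
\]

for $H > K$. The exact expressions for the up-and-in put follow simply by symmetry,
$P^{UI} = P - P^{UO}$.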
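The put formulas (3.56), (3.57), and (3.58) can be coded in the same way as the calls. The sketch below is illustrative rather than definitive: the function names are assumptions, and it relies on the form of the reflected terms as reconstructed in the formulas above. The boundary checks in the usage note give a simple way to confirm the signs.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def d_plus(x, r, sigma, tau):
    return (log(x) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))


def d_minus(x, r, sigma, tau):
    return d_plus(x, r, sigma, tau) - sigma * sqrt(tau)


def plain_put(S0, K, r, sigma, tau):
    """Black-Scholes value of a plain European put."""
    return K * exp(-r * tau) * norm_cdf(-d_minus(S0 / K, r, sigma, tau)) - S0 * norm_cdf(
        -d_plus(S0 / K, r, sigma, tau)
    )


def down_and_in_put(H, S0, K, r, sigma, tau):
    """Equation (3.56), for S0 >= H and H <= K."""
    a = 2.0 * r / sigma**2
    dk = K * exp(-r * tau)
    y, z = H / S0, H * H / (S0 * K)
    return (
        -S0 * norm_cdf(-d_plus(S0 / H, r, sigma, tau))
        + dk * norm_cdf(-d_minus(S0 / H, r, sigma, tau))
        + dk * y ** (a - 1.0)
        * (norm_cdf(d_minus(y, r, sigma, tau)) - norm_cdf(d_minus(z, r, sigma, tau)))
        - H * y**a
        * (norm_cdf(d_plus(y, r, sigma, tau)) - norm_cdf(d_plus(z, r, sigma, tau)))
    )


def up_and_out_put(H, S0, K, r, sigma, tau):
    """Equations (3.57) for H < K and (3.58) for H >= K, with S0 <= H."""
    a = 2.0 * r / sigma**2
    dk = K * exp(-r * tau)
    y = H / S0
    if H < K:
        return (
            -S0 * norm_cdf(-d_plus(S0 / H, r, sigma, tau))
            + dk * norm_cdf(-d_minus(S0 / H, r, sigma, tau))
            + H * y**a * norm_cdf(-d_plus(y, r, sigma, tau))
            - dk * y ** (a - 1.0) * norm_cdf(-d_minus(y, r, sigma, tau))
        )
    z = H * H / (S0 * K)
    return (
        plain_put(S0, K, r, sigma, tau)
        + S0 * y ** (a + 1.0) * norm_cdf(-d_plus(z, r, sigma, tau))
        - dk * y ** (a - 1.0) * norm_cdf(-d_minus(z, r, sigma, tau))
    )
```

At the barrier, `down_and_in_put` should reduce to the plain put and `up_and_out_put` should vanish; for $H = K$ the down-and-in put should equal the plain put for any admissible spot.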
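From equation (3.42), the up-and-out call is given by

\[
C^{UO}(H, S_0, K, \tau) = e^{-r\tau} \int_0^H U(H, S, S_0; \tau)(S - K)_+\,dS. \tag{3.59}
\]

Since the payoff function is zero for $S < K$, $C^{UO} = 0$ for $H < K$. For $H > K$ the option value
can be rewritten as

\[
C^{UO}(H, S_0, K, \tau) = e^{-r\tau} \int_H^K U(H, S, S_0; \tau)(K - S)\,dS. \tag{3.60}
\]

As observed from equation (3.54), this is formally the same expression as the value of the
down-and-out put option for $H < K$. By extracting out the plain call value $C(S_0, K, \tau)$ from the
last expression on the right-hand side of (3.54), the result can be recast as

\[
C^{UO}(H, S_0, K, \tau) = C(S_0, K, \tau) - C^{UI}(H, S_0, K, \tau), \tag{3.61}
\]

with up-and-in call option value

\[
\begin{aligned}
C^{UI}(H, S_0, K, \tau) ={}& S_0 N\!\left(d_+\!\left(\frac{S_0}{H}\right)\right) - Ke^{-r\tau} N\!\left(d_-\!\left(\frac{S_0}{H}\right)\right) \\
& + Ke^{-r\tau}\left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}-1}\left[ N\!\left(d_-\!\left(\frac{H}{S_0}\right)\right) - N\!\left(d_-\!\left(\frac{H^2}{S_0 K}\right)\right) \right] \\
& - H\left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}}\left[ N\!\left(d_+\!\left(\frac{H}{S_0}\right)\right) - N\!\left(d_+\!\left(\frac{H^2}{S_0 K}\right)\right) \right]
\end{aligned} \tag{3.62}
\]

for $H > K$. For $H = K$ we have $C^{UI} = C$, the plain call value, and $C^{UO} = 0$, as required.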
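A brief sketch of the up-and-in and up-and-out call values, (3.61) and (3.62), is given next. The function names and the sample parameters are hypothetical, and the grouping of the reflected terms follows the reconstruction of (3.62) above rather than any independently verified source.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def d_plus(x, r, sigma, tau):
    return (log(x) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))


def d_minus(x, r, sigma, tau):
    return d_plus(x, r, sigma, tau) - sigma * sqrt(tau)


def plain_call(S0, K, r, sigma, tau):
    """Black-Scholes value of a plain European call."""
    return S0 * norm_cdf(d_plus(S0 / K, r, sigma, tau)) - K * exp(-r * tau) * norm_cdf(
        d_minus(S0 / K, r, sigma, tau)
    )


def up_and_in_call(H, S0, K, r, sigma, tau):
    """Equation (3.62), valid for H >= K and S0 <= H."""
    a = 2.0 * r / sigma**2
    dk = K * exp(-r * tau)
    y, z = H / S0, H * H / (S0 * K)
    return (
        S0 * norm_cdf(d_plus(S0 / H, r, sigma, tau))
        - dk * norm_cdf(d_minus(S0 / H, r, sigma, tau))
        + dk * y ** (a - 1.0)
        * (norm_cdf(d_minus(y, r, sigma, tau)) - norm_cdf(d_minus(z, r, sigma, tau)))
        - H * y**a
        * (norm_cdf(d_plus(y, r, sigma, tau)) - norm_cdf(d_plus(z, r, sigma, tau)))
    )


def up_and_out_call(H, S0, K, r, sigma, tau):
    """Equation (3.61); the value is zero when the barrier is at or below the strike."""
    if H <= K:
        return 0.0
    return plain_call(S0, K, r, sigma, tau) - up_and_in_call(H, S0, K, r, sigma, tau)
```

With, say, $S_0 = 100$, $K = 100$, $H = 120$, $r = 0.05$, $\sigma = 0.2$, $\tau = 1$, the up-and-out call is a small fraction of the plain call, as expected for a barrier close to the money, and it vanishes as $S_0 \to H$.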
All of the preceding analytical pricing formulas for geometric Brownian motion are
also readily extended to the case of a time-dependent barrier that has an exponential form
$H(\tau) = He^{-\alpha\tau}$, with $\alpha$, $H$ as constants. Assuming the choice $\alpha > 0$, the barrier boundary is
an increasing function of calendar time (or decreasing function of time to maturity $\tau$). For
a given $\tau$, the solution domain for the underlying asset price is $[H(\tau), \infty)$ for a down-and-out
and $(0, H(\tau)]$ for an up-and-out. Assuming geometric Brownian motion as before with
constant drift $\mu$ and volatility $\sigma$, the single-barrier kernel for this exponentially shaped barrier
with zero-boundary condition at the $\tau$-dependent boundary level $S_0 = H(\tau)$ (and at the other
endpoint $S_0 = 0$ or $S_0 = \infty$) is denoted by $U^{H(\tau)}(S, S_0, \mu; \tau)$. [Note that we use a notation
involving the explicit functional dependence on the drift parameter needed to precisely clarify
the arguments that follow.] It can be readily shown (see Problem 3) that this kernel is given
by the constant barrier kernel in equation (3.40), now denoted by $U(H, S, S_0, \mu; \tau)$, where we
replace the arguments $S_0 \to S_0 e^{\alpha\tau}$ and $\mu \to \mu - \alpha$. That is,

\[
U^{H(\tau)}(S, S_0, \mu; \tau) = U(H, S, S_0 e^{\alpha\tau}, \mu - \alpha; \tau). \tag{3.63}
\]

The risk-neutral pricing kernel for the exponential barrier with lognormal drift $\mu = r$
(the assumed constant interest rate) is then explicitly given by

\[
\begin{aligned}
U^{H(\tau)}(S, S_0, r; \tau) = \frac{1}{\sigma S\sqrt{2\pi\tau}} \Bigg\{ & e^{-\left[\log\frac{S}{S_0} - (r-\frac{1}{2}\sigma^2)\tau\right]^2/2\sigma^2\tau} \\
& - \left(\frac{H(\tau)}{S_0}\right)^{\frac{2(r-\alpha)}{\sigma^2}-1} e^{-\left[\log\frac{S S_0}{H^2} - (r-2\alpha-\frac{1}{2}\sigma^2)\tau\right]^2/2\sigma^2\tau} \Bigg\}.
\end{aligned} \tag{3.64}
\]

Setting $\alpha = 0$ obviously recovers the previous risk-neutral density for the case with constant
barrier.

Exact pricing formulas for European knockouts and knock-ins for exponential barriers can
be obtained by integrating the density given by equation (3.64) and following similar steps
as were used earlier for the case of a constant barrier. However, a more straightforward approach
is to make use of relation (3.63) directly in the risk-neutral pricing formula. Consider a
down-and-out with payoff $\Lambda(S)$. Since the barrier level at maturity is $H(0) = H$, the integration
over the terminal price runs over $S \ge H$, and the risk-neutral price is

\[
\begin{aligned}
V^{DO}(H(\tau), S_0, r, \tau) &= e^{-r\tau} \int_H^\infty U^{H(\tau)}(S, S_0, r; \tau)\,\Lambda(S)\,dS \\
&= e^{-r\tau} \int_H^\infty U(H, S, S_0 e^{\alpha\tau}, r - \alpha; \tau)\,\Lambda(S)\,dS \\
&= e^{-\alpha\tau}\,V^{DO}(S_0 e^{\alpha\tau}, H, r - \alpha, \tau),
\end{aligned} \tag{3.65}
\]

where $V^{DO}(S_0 e^{\alpha\tau}, H, r - \alpha, \tau)$ is the value of the down-and-out with spot $S_0 e^{\alpha\tau}$, constant
barrier level at $H$, effective interest rate $r - \alpha$, and time to maturity $\tau$. Similarly, an up-and-out
has value

\[
V^{UO}(H(\tau), S_0, r, \tau) = e^{-\alpha\tau}\,V^{UO}(S_0 e^{\alpha\tau}, H, r - \alpha, \tau). \tag{3.66}
\]

The corresponding prices of the knock-ins are obtained simply from knock-in/knockout symmetry.
Since the barrier-free pricing kernel (3.39) satisfies the invariance relation $U_0(S, S_0, r; \tau) =
U_0(S, S_0 e^{\alpha\tau}, r - \alpha; \tau)$, the plain-vanilla price satisfies

\[
V(S_0, r, \tau) = e^{-\alpha\tau}\,V(S_0 e^{\alpha\tau}, r - \alpha, \tau). \tag{3.67}
\]

Given a pricing formula for the constant barrier case, the corresponding pricing formula
for the exponentially shaped barrier follows from equation (3.65) or (3.66). For example,
applying equation (3.65) to equation (3.53) gives the exact price of a down-and-out call with
exponential barrier for $H(\tau) < K$ as

\[
C^{DO}(H(\tau), S_0, K, \tau) = C(S_0, K, \tau) - \left(\frac{H(\tau)}{S_0}\right)^{\frac{2(r-\alpha)}{\sigma^2}-1} C\!\left(\frac{H(\tau)^2}{S_0}, K, \tau\right), \tag{3.68}
\]

where equation (3.67) has been used on the two plain calls. Analogous formulas for the other
types of knock-in and knockout barrier options discussed follow in similar fashion.
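The scaling relation (3.65) and the explicit formula (3.68) can be cross-checked numerically. The sketch below assumes the barrier has the form $H(\tau) = He^{-\alpha\tau}$ and that the constant barrier satisfies $H < K$; the function names and the test parameters are illustrative choices, not part of the text.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def plain_call(S0, K, r, sigma, tau):
    """Black-Scholes value of a plain European call with rate r."""
    s = sigma * sqrt(tau)
    dp = (log(S0 / K) + (r + 0.5 * sigma**2) * tau) / s
    return S0 * norm_cdf(dp) - K * exp(-r * tau) * norm_cdf(dp - s)


def do_call_const(H, S0, K, r, sigma, tau):
    """Constant-barrier down-and-out call for H < K, equation (3.53)."""
    power = 2.0 * r / sigma**2 - 1.0
    return plain_call(S0, K, r, sigma, tau) - (H / S0) ** power * plain_call(
        H * H / S0, K, r, sigma, tau
    )


def do_call_exp_barrier(H, alpha, S0, K, r, sigma, tau):
    """Exponential-barrier down-and-out call, equation (3.68)."""
    h_tau = H * exp(-alpha * tau)  # current barrier level H(tau)
    power = 2.0 * (r - alpha) / sigma**2 - 1.0
    return plain_call(S0, K, r, sigma, tau) - (h_tau / S0) ** power * plain_call(
        h_tau * h_tau / S0, K, r, sigma, tau
    )


def do_call_exp_barrier_via_scaling(H, alpha, S0, K, r, sigma, tau):
    """Same price through relation (3.65): rescaled spot and rate r - alpha."""
    return exp(-alpha * tau) * do_call_const(
        H, S0 * exp(alpha * tau), K, r - alpha, sigma, tau
    )
```

The two routes should agree to rounding error, the price should vanish when the spot sits on the current barrier level $H(\tau)$, and setting $\alpha = 0$ should recover the constant-barrier value.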


Problems 
Problem 1. Show that the function $\tilde V(S_0, \tau) = S_0^{\alpha}\,V(aS_0^{\beta}, \tau)$ satisfies the Black-Scholes
equation

\[
\frac{\partial \tilde V}{\partial \tau} = \frac{1}{2}\sigma^2 S_0^2 \frac{\partial^2 \tilde V}{\partial S_0^2} + rS_0 \frac{\partial \tilde V}{\partial S_0} - r\tilde V, \tag{3.69}
\]

where $V(\tilde S_0, \tau)$ is assumed to satisfy the same equation in the $(\tilde S_0, \tau)$ variables, $\tilde S_0 = aS_0^{\beta}$,
and provided we make the parameter choice $\alpha = 1 - 2r/\sigma^2$, $\beta = -1$, for arbitrary nonzero
constant $a$. Then consider expressing the price of a down-and-out call struck at $K$, with
constant barrier at $H < K$, as a linear combination of two solutions using plain calls

\[
C^{DO} = C(S_0, K, \tau) + b\,S_0^{1-\frac{2r}{\sigma^2}}\,C(a/S_0, K, \tau). \tag{3.70}
\]

Determine the constants $a$ and $b$ by satisfying the zero-boundary condition at the barrier
$S_0 = H$ and the initial condition $C^{DO} \to (S_0 - K)_+$ as $\tau \to 0$, hence arriving at (3.53).


Problem 2. Derive the greeks $\Delta$, $\Gamma$, $\Theta$ (as defined in Chapter 1) for the down-and-out call,
with value $V = C^{DO}$ given by equation (3.53). Is the relationship $\Theta = \frac{1}{2}\sigma^2 S_0^2\,\Gamma + r(S_0\Delta - V)$
satisfied?


Problem 3. Consider the exponential barrier $H(\tau) = He^{-\alpha\tau}$, with $H$ and $\alpha$ as constants. Let
$U(S, S_0; \tau) = U^{H(\tau)}(S, S_0; \tau)$ be the pricing kernel solving

\[
\frac{\partial U}{\partial \tau} = \frac{1}{2}\sigma^2 S_0^2 \frac{\partial^2 U}{\partial S_0^2} + \mu S_0 \frac{\partial U}{\partial S_0} \tag{3.71}
\]

and satisfying $U(S, S_0 = H(\tau); \tau) = 0$ and $U(S, S_0 = \infty; \tau) = 0$ for the fundamental solution
in the upper domain $H(\tau) < S_0 < \infty$, or $U(S, S_0 = 0; \tau) = 0$ for the case of the lower solution
domain $0 < S_0 < H(\tau)$. Let $U(S, S_0; \tau) = \tilde U(S, \tilde S; \tau)$, $\tilde S = S_0 e^{\alpha\tau}$, and show that $\tilde U$ solves

\[
\frac{\partial \tilde U}{\partial \tau} = \frac{1}{2}\sigma^2 \tilde S^2 \frac{\partial^2 \tilde U}{\partial \tilde S^2} + (\mu - \alpha)\tilde S \frac{\partial \tilde U}{\partial \tilde S}, \tag{3.72}
\]

with $\tilde U(S, \tilde S = H; \tau) = 0$ and $\tilde U(S, \tilde S = \infty; \tau) = 0$ for the fundamental solution in the upper
domain $H < \tilde S < \infty$, or $\tilde U(S, \tilde S = 0; \tau) = 0$ for the case of the lower domain $0 < \tilde S < H$.
Hence $\tilde U(S, \tilde S; \tau) = U(H, S, S_0 e^{\alpha\tau}, \mu - \alpha; \tau)$, with the function $U$ given by equation (3.40)
for constant barrier level $H$ and drift $\mu - \alpha$ (in the place of $\mu$), and conclude that the kernel
$U^{H(\tau)}$ for the time-dependent exponential barrier with drift $\mu$ is given by equation (3.63),
while setting $\mu = r$ gives equation (3.64).


3.4 First-Passage Time 


When pricing exotic barrier options it is useful to consider the first-passage time of a diffusion
process, i.e., the first time at which a process achieves a particular value or enters (exits) a
region. In particular, for the sake of pricing, we are interested in the first-passage time for
an asset price process crossing a specified constant barrier level $H > 0$. We hence consider
calculating the probability distribution for the first-passage time, the time taken to attain the
absorbing barrier. Consider the case of an upper barrier with current asset price $S_0 < H$, and
let $t - t_0 = \tau > 0$ be the amount of time elapsed from current time $t_0$ until the barrier is first
attained at time $t$. Then

\[
\Phi(H, S_0, \tau) = 1 - \int_0^H U(H, S, S_0; \tau)\,dS \tag{3.73}
\]

represents the probability (cumulative in the passage time $\tau$) that the asset price process
has attained the upper barrier $H$ and has been absorbed. Indeed, this is just 1 minus the
probability that the asset price remains below the barrier, or, equivalently, $\Phi$ is the probability
of absorption. If we denote $\tau_H = \min\{\tau;\ S_\tau \ge H,\ S_0 < H\}$ as the first-passage time random
variable, then $\Phi(H, S_0, \tau)$ is the probability $P\{\tau_H \le \tau\}$. The function $U(H, S, S_0; \tau)$ is the
kernel for the solution region $[0, H]$ with absorbing boundary condition at the barrier. [Note
that although we are considering a time-homogeneous process, with state-dependent drift
and volatility functions, the formal theory extends in the obvious manner for the general
case of a time-inhomogeneous process, where we would consider a kernel $U(H, S, t, S_0, t_0)$
having explicit dependence on $t$ and $t_0$ rather than $\tau = t - t_0$.] As $\tau \to 0$, the integrand gives
a Dirac delta function contribution $\delta(S - S_0)$ in the region $[0, H]$ and hence integrates to
unity; therefore $\Phi(H, S_0, \tau = 0) = 0$. Since $U(H, S, S_0; \tau)$ is identically zero for $S_0 = H$, $\Phi$
has boundary condition $\Phi(H, S_0 = H, \tau) = 1$. Moreover, $U$ is a kernel and hence
solves both forward and backward Kolmogorov equations for the asset price diffusion process.
Since partial derivatives with respect to $S_0$ and $\tau$ can be taken inside the integral, the
cumulative probability function for the first-passage time, $\Phi$, is therefore a solution of the
time-homogeneous backward (and not the forward) Kolmogorov partial differential equation
in $S_0$, $\tau$ subject to the foregoing conditions.

The other case, where $H$ is a lower barrier with current asset price $S_0 > H$, is similar,
with equation (3.73) replaced by

\[
\Phi(H, S_0, \tau) = 1 - \int_H^\infty U(H, S, S_0; \tau)\,dS, \tag{3.74}
\]

which is the cumulative probability that the asset price process has attained the lower barrier
and has been absorbed, where $U$ is the kernel for the solution region $[H, \infty)$ with absorbing
boundary condition at the barrier. The first-passage time random variable is now the stopping
time $\tau_H = \min\{\tau;\ S_\tau \le H,\ S_0 > H\}$. From similar arguments as before, one again obtains that $\Phi$
solves the same backward Kolmogorov equation with unit boundary condition at the barrier,
$\Phi(H, S_0 = H, \tau) = 1$, and zero initial condition $\Phi(H, S_0, \tau = 0) = 0$.

In both cases, the function $\Phi$ can be obtained by solving the backward Kolmogorov
equation subject to the stated conditions. However, given the kernel $U$, $\Phi$ is simply determined
by an integration via equation (3.73) [or (3.74)]. Since $\Phi$ is a cumulative function, the probability
density function $f$ for the first-passage time must be given by differentiation:

\[
f(H, S_0, \tau) = \frac{\partial \Phi(H, S_0, \tau)}{\partial \tau}. \tag{3.75}
\]

For $f$ to be a bona fide probability density, it must be nonnegative and must integrate
to unity over all positive $\tau$. Integrating,

\[
\int_0^\infty f(H, S_0, \tau)\,d\tau = \int_0^\infty \frac{\partial \Phi(H, S_0, \tau)}{\partial \tau}\,d\tau = \Phi(H, S_0, \infty) \tag{3.76}
\]

hence gives $\Phi(H, S_0, \infty) = 1$ as the latter condition. This is not generally satisfied, as we
shall see next for the specific case of geometric Brownian motion. Since the integral in
equation (3.76) gives the probability that (given any amount of time) a path starting at $S_0$ will
eventually be absorbed at the barrier, this quantity is generally less than or equal to 1. The
condition of nonnegativity of $f$, however, can be shown to follow for quite general processes
(see Problem 1).

For geometric Brownian motion with drift $r$ and volatility $\sigma$ it is a simple matter to obtain
exact formulas for the first-passage densities based on the exact kernel in equation (3.40).
In particular, the integrals in equations (3.73) and (3.74) are given by direct use of
equation (3.49), giving

\[
\Phi(H, S_0, \tau) = N\!\left(-d_-\!\left(\frac{S_0}{H}\right)\right) + \left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}-1} N\!\left(d_-\!\left(\frac{H}{S_0}\right)\right) \tag{3.77}
\]

for $S_0 > H$ and

\[
\Phi(H, S_0, \tau) = N\!\left(d_-\!\left(\frac{S_0}{H}\right)\right) + \left(\frac{H}{S_0}\right)^{\frac{2r}{\sigma^2}-1} N\!\left(-d_-\!\left(\frac{H}{S_0}\right)\right) \tag{3.78}
\]

for $S_0 < H$. Hence, for $S_0 > H$ we obtain the limiting value of equation (3.77):

\[
\Phi(H, S_0, \infty) =
\begin{cases}
1, & r \le \frac{1}{2}\sigma^2, \\[4pt]
\left(\dfrac{H}{S_0}\right)^{\frac{2r}{\sigma^2}-1}, & r > \frac{1}{2}\sigma^2,
\end{cases} \tag{3.79}
\]

upon using $N(\infty) = 1$, $N(-\infty) = 0$. Hence, if $r < \frac{1}{2}\sigma^2$, the cumulative probability approaches
unity in the infinite-passage time limit so that, with probability 1, absorption eventually
occurs. On the other hand, if $r > \frac{1}{2}\sigma^2$, the cumulative probability approaches a number strictly
between 0 and 1, since $H/S_0 < 1$ and $\frac{2r}{\sigma^2} - 1 > 0$, so the probability of eventual absorption is
less than 1. In contrast, taking the infinite time limit of equation (3.78) gives, for $S_0 < H$,

\[
\Phi(H, S_0, \infty) =
\begin{cases}
1, & r \ge \frac{1}{2}\sigma^2, \\[4pt]
\left(\dfrac{H}{S_0}\right)^{\frac{2r}{\sigma^2}-1}, & r < \frac{1}{2}\sigma^2.
\end{cases} \tag{3.80}
\]

In this case, the reverse is observed, whereby the cumulative probability approaches unity only
if $r \ge \frac{1}{2}\sigma^2$ and otherwise approaches a number strictly between 0 and 1. A basic interpretation
of this is that eventual absorption will take place with certainty only if the effective drift, which is
given by $r - \frac{1}{2}\sigma^2$, is not positive (or not negative) if the process starts above (or below) the
barrier. By differentiating equations (3.77) and (3.78) and combining, the exact first-passage
time density can be written as a single expression:

\[
f(H, S_0, \tau) = \frac{\left|\log\frac{S_0}{H}\right|}{\sigma\tau^{3/2}\sqrt{2\pi}}\; e^{-\left[\log(S_0/H) + (r-\frac{1}{2}\sigma^2)\tau\right]^2/2\sigma^2\tau} \tag{3.81}
\]

for all $S_0, H > 0$.
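The relation between the cumulative first-passage probability and the density (3.81) is easy to verify numerically. The following sketch assumes the standard reflection-principle form of the cumulative probability for geometric Brownian motion (a normal term plus a power of $H/S_0$ times a second normal term); the function names are hypothetical and the finite-difference step is an arbitrary choice.

```python
from math import erf, exp, log, pi, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def hit_probability(H, S0, r, sigma, tau):
    """Probability that the barrier H is attained within time tau (S0 != H)."""
    s = sigma * sqrt(tau)

    def d_minus(x):
        return (log(x) + (r - 0.5 * sigma**2) * tau) / s

    power = 2.0 * r / sigma**2 - 1.0
    sign = 1.0 if S0 > H else -1.0  # lower barrier (S0 > H) or upper barrier (S0 < H)
    return norm_cdf(-sign * d_minus(S0 / H)) + (H / S0) ** power * norm_cdf(
        sign * d_minus(H / S0)
    )


def first_passage_density(H, S0, r, sigma, tau):
    """First-passage time density, equation (3.81)."""
    a = log(S0 / H)
    nu = r - 0.5 * sigma**2
    return (
        abs(a)
        / (sigma * tau**1.5 * sqrt(2.0 * pi))
        * exp(-((a + nu * tau) ** 2) / (2.0 * sigma**2 * tau))
    )


def density_by_finite_difference(H, S0, r, sigma, tau, h=1e-5):
    """Central difference of the cumulative probability in tau."""
    up = hit_probability(H, S0, r, sigma, tau + h)
    down = hit_probability(H, S0, r, sigma, tau - h)
    return (up - down) / (2.0 * h)
```

For a lower barrier with $r > \frac{1}{2}\sigma^2$, evaluating `hit_probability` at a very large `tau` should approach $(H/S_0)^{2r/\sigma^2 - 1}$, in line with the limiting behaviour discussed above.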

The first-passage time density is useful when pricing "pay-at-hit one touch" type of
options or for pricing barrier options that also provide a rebate payment to the holder once
the barrier is hit. In the case of a down-and-out option with a rebate, equation (3.41) becomes

\[
V^{DO}(S_0, \tau) = e^{-r\tau} \int_H^\infty U(H, S, S_0; \tau)\,\Lambda(S)\,dS + \int_0^\tau e^{-rt}\,R(\tau - t)\,f(H, S_0, t)\,dt.
\]

The time integral term is just the expected present value of the rebate, whereby discounted
payments occurring at an elapsed time $t$ in the future from the present are weighted with
the first-passage time density for hitting the barrier after an elapsed time $t$. The time-dependent
rebate function $R$ is here assumed to be a function of the time remaining to maturity.

The first-passage time is also a very useful tool for computing options whose prices
depend on stopping times that can be interpreted as first hitting times. Nice examples of such
options are the American digitals. We have already discussed the payoff structure of European
digitals. The Black-Scholes price of European digitals is simple to obtain (see Problem 8
in Section 1.8). The pay-off of an American digital is similar: the holder of an American
digital receives one dollar if, and at the first time that, the underlying stock price hits the
fixed strike level $K$. Since the option expires with a pay-off to the holder at the instant the
spot hits the strike level, the early-exercise boundary, as such, is trivially fixed at the strike $K$.
The time optionality in this case is simpler than in the standard American contracts (e.g., a
put or call with dividend, etc.) studied in Chapter 1, where the early-exercise boundary is
moving with time. The optimal stopping time is, in this case, just the first hitting time $\tau$
such that $S_\tau = K$. Once a hitting time $\tau$ occurs, the contract expires, paying one dollar at
time $\tau$, and the value of that cash flow is the discounted value of one dollar, i.e., $e^{-r\tau}$. Given
the probability density $f(H, S_0, \tau)$ for the first hitting time as provided by equation (3.75),
the fair price of the American digital at time $t_0 = 0$, with maturity $T$, spot $S_0$, and strike $K$,
reduces to a time integral:

\[
C^{AD}(S_0, K, T) = \int_0^T e^{-r\tau} f(K, S_0, \tau)\,d\tau = \int_0^T e^{-r\tau}\,\frac{\partial \Phi(K, S_0, \tau)}{\partial \tau}\,d\tau. \tag{3.82}
\]

Closed-form analytical expressions can therefore be derived assuming a geometric Brownian
motion model (see Problem 4).
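Before deriving a closed form, the time integral in (3.82) can be evaluated by simple quadrature. The sketch below uses a midpoint rule and the geometric Brownian motion density (3.81) with the barrier set to the strike; the function names, the number of grid points, and the sample parameters are assumptions for illustration only.

```python
from math import erf, exp, log, pi, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def first_passage_density(H, S0, r, sigma, tau):
    """First-passage time density (3.81) for geometric Brownian motion."""
    a = log(S0 / H)
    nu = r - 0.5 * sigma**2
    return (
        abs(a)
        / (sigma * tau**1.5 * sqrt(2.0 * pi))
        * exp(-((a + nu * tau) ** 2) / (2.0 * sigma**2 * tau))
    )


def hit_probability_upper(K, S0, r, sigma, T):
    """Probability of reaching an upper level K > S0 before time T."""
    s = sigma * sqrt(T)
    nu_t = (r - 0.5 * sigma**2) * T
    power = 2.0 * r / sigma**2 - 1.0
    return norm_cdf((log(S0 / K) + nu_t) / s) + (K / S0) ** power * norm_cdf(
        -(log(K / S0) + nu_t) / s
    )


def american_digital(K, S0, r, sigma, T, n=20000):
    """Midpoint-rule evaluation of the time integral in equation (3.82)."""
    h = T / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h  # midpoints avoid evaluating the density at t = 0
        total += exp(-r * t) * first_passage_density(K, S0, r, sigma, t)
    return total * h
```

With zero interest rate the discount factor drops out, so the quadrature should reproduce the probability of hitting the strike before maturity; with $r > 0$ the discounted value is smaller than that hitting probability.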


Problems 


Problem 1. Consider a process with state-dependent drift $\mu(S)$ and volatility $\sigma(S)$. Argue
that the first-passage time density for either lower or upper barrier case is nonnegative.
In developing your argument, consider the derivative with respect to $\tau$ of $\Phi$ defined via
equation (3.73) [and (3.74) separately] and make use of the forward equation for the single
barrier density $U = U(H, S, S_0; \tau)$:

\[
\frac{\partial U}{\partial \tau} = \frac{1}{2}\frac{\partial^2}{\partial S^2}\left(\sigma^2(S)\,U\right) - \frac{\partial}{\partial S}\left(\mu(S)\,U\right). \tag{3.83}
\]

Integrating over $S$ and assuming $\mu(S)U$ and $\sigma^2(S)U$ satisfy zero-boundary conditions at
the endpoints, arrive at the expressions

\[
f(H, S_0, \tau) = \pm\frac{1}{2}\sigma^2(H)\,\frac{\partial U}{\partial S}\bigg|_{S=H},
\]

where the plus sign is for $S_0 > H$ and the minus sign is for $S_0 < H$. Using the fact that the
kernel is a positive differentiable function of $S$ within either solution interval and has zero
value at the barrier endpoint, further argue that

\[
f(H, S_0, \tau) = \frac{1}{2}\sigma^2(H)\left|\frac{\partial U}{\partial S}\right|_{S=H}, \tag{3.84}
\]

which is hence nonnegative.


Problem 2. Using the kernel in equation (3.40), give an explicit verification that
equation (3.84) gives the exact first-passage time density in equation (3.81) for geometric Brownian
motion where $\mu(S) = rS$, $\sigma(S) = \sigma S$.


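Problem 3. Assume the exponentially time-dependent barrier of the previous section, $H(\tau) =
He^{-\alpha\tau}$, $\alpha > 0$. Show that the first-passage time density for $S_0 > H$ is given by

\[
f(H, S_0, \tau) = \frac{\log\frac{S_0}{H(\tau)}}{\sigma\tau^{3/2}\sqrt{2\pi}}\; e^{-\left[\log(S_0/H(\tau)) + (r-\alpha-\frac{1}{2}\sigma^2)\tau\right]^2/2\sigma^2\tau}. \tag{3.85}
\]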


Problem 4. Obtain an analytical pricing formula for an American digital within the geometric
Brownian motion model.


3.5 Pricing Kernels and Barrier Option Formulas for Linear and Quadratic Volatility Models


The kernels for the Wiener process obtained in the previous sections are readily used as a
basis for providing other exact kernels and barrier-pricing formulas for affine and quadratic
volatility models. The formulas follow as a simple consequence of a more general method,
which we coin as the diffusion canonical mapping reduction methodology. This mathematical
framework is presented in detail later in this chapter. In particular, it provides a precise
relationship between the transition probability density or pricing kernel $U(F, F_0; \tau)$ for the
space of a process $F_t$ and the transition density $u(x, x_0; \tau)$ for a process $x_t$ under a simpler
diffusion. In this section it suffices to consider $x_t$ as the pure Wiener process. The
process $F_t$ represents an underlying asset price, such as a forward price at time $t$. Hence,
given an exact kernel for the simpler x-space process, we show how the mapping reduction
method automatically provides the desired exact pricing kernel for the more complicated
F-space process. Moreover, the desired boundary conditions in F-space (e.g., in the desired
asset price space) are satisfied by mapping onto the corresponding boundary conditions in
x-space.


3.5.1 Linear Volatility Models Revisited


Although we have already dealt with the linear volatility model (i.e., the standard
Black-Scholes model) in great detail in previous sections, it is instructive to see how the solutions
to the linear volatility model also arise as a very special case of the diffusion canonical
mapping reduction method, wherein the underlying x-space process is the simple Wiener
process. In particular, assume the two processes satisfy $dx_t = \sqrt{2}\,dW_t$ and $dF_t = \sigma(F_t)\,dW_t$,
under appropriate respective measures, where the $F_t$ process is considered to have zero
drift and linear volatility function $\sigma(F) = \sigma_0 F$, $\sigma_0 = \text{const}$. The (forward price) space of $F$
values is mapped one to one onto the entire space of the Wiener process with the variable
transformation

\[
x = X(F) = (\sqrt{2}/\sigma_0)\log F \tag{3.86}
\]

with inverse $F = F(x) = e^{\sigma_0 x/\sqrt{2}}$. Since $dX/dF = \sqrt{2}/\sigma(F) = \sqrt{2}/(\sigma_0 F)$, the transformation
reduction equation (3.259) of Lemma 3.1, to be derived in Section 3.8.1, specializes to give

\[
\begin{aligned}
U(F, F_0; \tau) &= \frac{\sqrt{2}}{\sigma_0 F}\exp\left[-\frac{1}{2}\log\frac{F}{F_0} - \frac{\sigma_0^2}{8}\tau\right] u(X(F), X(F_0); \tau) \\
&= \frac{\sqrt{2}}{\sigma_0}\sqrt{\frac{F_0}{F^3}}\; e^{-\sigma_0^2\tau/8}\, u(X(F), X(F_0); \tau).
\end{aligned} \tag{3.87}
\]

Here the factor $e^{-\sigma_0^2\tau/8}$ results from equation (3.257) while substituting
for the x-space volatility function (as constant) $\nu(x) = \sqrt{2}$ and drift $\lambda(x) = 0$. At this
point the reader should note that the two transition probability densities $U$ and $u$ are not just
simply related by a change of variables (i.e., the two functions are not the same probability
densities expressed in terms of two different variables), but rather also involve the exponential
multiplicative term due essentially to a measure change. This point will become clear later
in this chapter when we come to discuss the mapping reduction framework in general. The
mapping $x = X(F)$ and its inverse are monotonically increasing, with domain $x \in (-\infty, \infty)$
mapped onto $F \in (0, \infty)$. By direct substitution, while changing variables of differentiation
and using equations (3.7) and (3.8), the reader can readily verify that $U = U(F, F_0; \tau)$ in
equation (3.87) indeed satisfies both forward and backward equations:
$\partial U/\partial\tau = \frac{1}{2}\,\partial^2(\sigma_0^2 F^2 U)/\partial F^2$ and $\partial U/\partial\tau = \frac{1}{2}\sigma_0^2 F_0^2\,\partial^2 U/\partial F_0^2$.

Equation (3.87) gives an exact relationship between a kernel $U$ for the linear volatility
model and a kernel $u$ for the Wiener process. The unique pricing kernels for the
barrier-free case as well as for the case of single and double barriers then follow automatically
by substitution of the particular kernel $u$ that satisfies the appropriate boundary
conditions. For the barrier-free case, the zero-boundary conditions $U(F = 0, F_0; \tau) = U(F =
\infty, F_0; \tau) = 0$ (with the same conditions also holding in $F_0$) are satisfied by substituting the
solution $u(x, x_0; \tau) = g_0(x, x_0; \tau)$ of equation (3.9) into equation (3.87). Upon using
equation (3.86) and completing the square in the exponent, one obtains the zero-drift lognormal
density

\[
U(F, F_0; \tau) = \frac{1}{\sigma_0 F\sqrt{2\pi\tau}}\exp\left[-\frac{\left(\log\frac{F}{F_0} + \frac{1}{2}\sigma_0^2\tau\right)^2}{2\sigma_0^2\tau}\right]. \tag{3.88}
\]

As required, this formula is consistent with equation (3.39), where $S_t = e^{rt}F_t$ or, alternatively,
with the case of zero drift $\mu = 0$, with $S = F$, $S_0 = F_0$. A barrier level at $F = H$ (or $F_0 = H$)
corresponds to $H = F(x_H) = e^{\sigma_0 x_H/\sqrt{2}}$, so $x_H = X(H) = (\sqrt{2}/\sigma_0)\log H$. Hence the lower
region $F, F_0 \in (0, H]$ maps onto $x, x_0 \in (-\infty, x_H]$, whereas the upper region $F, F_0 \in [H, \infty)$
maps onto $x, x_0 \in [x_H, \infty)$. The density for a single absorbing barrier at $F, F_0 = H$ is hence
obtained by simply substituting the kernel $u(X(F), X(F_0); \tau) = g(X(H), X(F), X(F_0); \tau)$ of
equation (3.14) into relation (3.87), giving:

\[
\begin{aligned}
U(H, F, F_0; \tau) &= \frac{\sqrt{2}}{\sigma_0 F}\exp\left[-\frac{1}{2}\log\frac{F}{F_0} - \frac{\sigma_0^2}{8}\tau\right] g(X(H), X(F), X(F_0); \tau) \\
&= U(F, F_0; \tau)\left(1 - \exp\left[-\frac{\log(F/H)\log(F_0/H)}{\sigma_0^2\tau/2}\right]\right),
\end{aligned} \tag{3.89}
\]

with $U(F, F_0; \tau)$ given by equation (3.88). Note that this gives (absorbing) zero-boundary
conditions $U(H, F = H, F_0; \tau) = U(H, F, F_0 = H; \tau) = 0$ and that equation (3.89) is exactly
consistent with equation (3.40) when $\mu = 0$.
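The statement that the Wiener kernel maps onto the zero-drift lognormal density can be confirmed with a few lines of code. In the sketch below, `g0` is the free kernel of the process $dx_t = \sqrt{2}\,dW_t$ (a Gaussian with variance $2\tau$), and the remaining function names are illustrative; the constant called `sigma0` stands for the linear volatility coefficient.

```python
from math import exp, log, pi, sqrt


def g0(x, x0, tau):
    """Free transition density of dx = sqrt(2) dW: Gaussian with variance 2*tau."""
    return exp(-((x - x0) ** 2) / (4.0 * tau)) / sqrt(4.0 * pi * tau)


def X(F, sigma0):
    """Variable transformation (3.86)."""
    return sqrt(2.0) / sigma0 * log(F)


def U_mapped(F, F0, sigma0, tau):
    """Right-hand side of (3.87) with u taken as the free Wiener kernel."""
    return (
        sqrt(2.0) / sigma0
        * sqrt(F0 / F**3)
        * exp(-(sigma0**2) * tau / 8.0)
        * g0(X(F, sigma0), X(F0, sigma0), tau)
    )


def U_lognormal(F, F0, sigma0, tau):
    """Zero-drift lognormal density (3.88)."""
    z = log(F / F0) + 0.5 * sigma0**2 * tau
    return exp(-(z**2) / (2.0 * sigma0**2 * tau)) / (sigma0 * F * sqrt(2.0 * pi * tau))


def U_single_barrier(H, F, F0, sigma0, tau):
    """Single absorbing barrier density, second line of (3.89)."""
    reflect = exp(-2.0 * log(F / H) * log(F0 / H) / (sigma0**2 * tau))
    return U_lognormal(F, F0, sigma0, tau) * (1.0 - reflect)
```

Evaluating `U_mapped` and `U_lognormal` at the same arguments should give the same number up to rounding, and `U_single_barrier` vanishes whenever either `F` or `F0` equals `H`.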

Exact analytical expressions for single-barrier options follow from the kernel in
equation (3.89). Ignoring discounting,³ an up-and-out European-style option expiring worthless if
the upper forward price barrier $F = H$ is crossed before a time to maturity $\tau$, with current
(forward) price level $F_0 \in (0, H)$, has a price given by [in direct analogy with equation (3.42)]

\[
V^{UO}(F_0, \tau) = \int_0^H U(H, F, F_0; \tau)\,\Lambda(F)\,dF, \tag{3.90}
\]

where $\Lambda(F)$ is an assumed payoff function. The corresponding down-and-out option with
$F_0 > H$ has price [in direct analogy with equation (3.41)]

\[
V^{DO}(F_0, \tau) = \int_H^\infty U(H, F, F_0; \tau)\,\Lambda(F)\,dF. \tag{3.91}
\]

The knock-in barrier option prices are obtained from (knock-in)-(knockout) symmetry as
discussed in Section 3.3. The plain-vanilla option price follows by integrating the barrier-free
kernel (3.88) against the payoff function:

\[
V(F_0, \tau) = \int_0^\infty U(F, F_0; \tau)\,\Lambda(F)\,dF. \tag{3.92}
\]
³ Throughout Section 3.5 we shall simply omit the overall discount factor in all the option-pricing formulas.

We shall not repeat the explicit intermediate steps in the derivations of the single-barrier
European call and put pricing formulas since the procedure follows in exactly the same
manner as discussed in Section 3.3. Calls and puts with forward price struck at $K$ are
assumed to have payoffs $\Lambda(F) = (F - K)_+$ and $\Lambda(F) = (K - F)_+$, respectively. The integrals
in equations (3.90) and (3.91) are then readily evaluated by considering the analogues of
equations (3.45) and (3.46), now defined by

\[
\varphi(B) = \int_B^\infty U(H, F, F_0; \tau)\,dF, \tag{3.93}
\]

\[
\bar\varphi(B) = \int_B^\infty U(H, F, F_0; \tau)\,F\,dF \tag{3.94}
\]

for any $B > 0$, with $U(H, F, F_0; \tau)$ given by equation (3.89). In particular, the price of a
down-and-out call on the underlying forward price struck at $K$ with single barrier at forward
price level $H$ is given by (ignoring discounting)

\[
C^{DO}(H, F_0, K, \tau) = \bar\varphi(B) - K\varphi(B), \tag{3.95}
\]

where $B = H$ if $H > K$ and $B = K$ if $H < K$. Exact expressions for $\varphi(B)$ and $\bar\varphi(B)$ follow
from equations (3.49) and (3.50) with $r = 0$ and $S_0 = F_0$:

\[
\varphi(B) = N\!\left(d_-\!\left(\frac{F_0}{B}\right)\right) - \frac{F_0}{H}\,N\!\left(d_-\!\left(\frac{H^2}{F_0 B}\right)\right), \tag{3.96}
\]

\[
\bar\varphi(B) = F_0 N\!\left(d_+\!\left(\frac{F_0}{B}\right)\right) - H N\!\left(d_+\!\left(\frac{H^2}{F_0 B}\right)\right), \tag{3.97}
\]

where

\[
d_\pm(x) = \frac{\log x \pm \frac{1}{2}\sigma_0^2\tau}{\sigma_0\sqrt{\tau}}. \tag{3.98}
\]

Hence setting $B = H$ if $H > K$ (and $B = K$ if $H < K$) gives the exact pricing formula for the
down-and-out call in terms of cumulative normal distribution functions:

\[
\begin{aligned}
C^{DO}(H, F_0, K, \tau) ={}& F_0 N\!\left(d_+\!\left(\frac{F_0}{H}\right)\right) - H N\!\left(d_+\!\left(\frac{H}{F_0}\right)\right) \\
& - K N\!\left(d_-\!\left(\frac{F_0}{H}\right)\right) + \frac{K F_0}{H} N\!\left(d_-\!\left(\frac{H}{F_0}\right)\right)
\end{aligned} \tag{3.99}
\]

for $H > K$; and for $H < K$,

\[
\begin{aligned}
C^{DO}(H, F_0, K, \tau) ={}& F_0 N\!\left(d_+\!\left(\frac{F_0}{K}\right)\right) - H N\!\left(d_+\!\left(\frac{H^2}{F_0 K}\right)\right) \\
& - K N\!\left(d_-\!\left(\frac{F_0}{K}\right)\right) + \frac{K F_0}{H} N\!\left(d_-\!\left(\frac{H^2}{F_0 K}\right)\right).
\end{aligned} \tag{3.100}
\]

All other cases of single-barrier calls and puts are derived in the same manner, as described
in detail in Section 3.3. The exact pricing formulas are obtained by simply setting $r = 0$ and
$S_0 = F_0$ in all of the option-pricing expressions in Section 3.3. This is indeed not surprising,
since discounting is ignored and the underlying is a forward price, as is the barrier level.

To complete this section we consider the problem of pricing a European double-knockout
barrier option on the underlying asset price process $F_t$. This option expires worthless if at
any time before maturity the underlying price attains either barrier at $L$ or $H$ with $L < H$.
The transition density in this case must have absorbing (i.e., zero) boundary conditions at
both finite barrier endpoints. These points are mapped into the x-space endpoints: $x_H =
X(H) = (\sqrt{2}/\sigma_0)\log H$, $x_L = X(L) = (\sqrt{2}/\sigma_0)\log L$. By virtue of the general relationship given
by equation (3.87), the F-space density follows by simply substituting the x-space transition
density satisfying zero-boundary conditions: $u(x = x_L, x_0; \tau) = u(x = x_H, x_0; \tau) = 0$. The
problem is hence again reduced to finding $u(x, x_0; \tau)$. This density is readily obtained as
an exact series expansion in sine functions via the method of eigenfunction expansions.
This method and its relation to the Laplace transform technique for solving the Kolmogorov
equations subject to different types of boundary conditions is generally described later in this
chapter, i.e., where the method of Green's functions is discussed. Here we simply state the
result (see Problem 1 of this section for an alternate derivation):








\[
u(x, x_0; \tau) = \frac{2}{x_H - x_L}\sum_{n=1}^{\infty} e^{-\rho_n\tau}\sin\left(\frac{n\pi(x_0 - x_L)}{x_H - x_L}\right)\sin\left(\frac{n\pi(x - x_L)}{x_H - x_L}\right) \tag{3.101}
\]

for $x, x_0 \in [x_L, x_H]$, where $\rho_n = n^2\pi^2/(x_H - x_L)^2$. [In mathematical physics, this is the
well-known Fourier sine series solution to the simple heat conduction problem for an initial point
source of heat diffusing on a finite one-dimensional domain (e.g., a rod) with both endpoints
held at zero temperature.] This series converges for all positive $\tau$ and gives a representation of the Dirac
delta function $\delta(x - x_0)$ for the finite domain $[x_L, x_H]$ when $\tau = 0$. Inserting equation (3.101)
into equation (3.87) while using equation (3.86) gives the transition density satisfying
double-barrier zero-boundary conditions at $L$ and $H$, denoted by $U^{DB}$, for the linear volatility model
as an exact series:

\[
U^{DB}(F, F_0; \tau) = \frac{2}{\log\frac{H}{L}}\sqrt{\frac{F_0}{F^3}}\;\sum_{n=1}^{\infty} e^{-\bar\rho_n\tau}\sin\left(n\pi y(F_0)\right)\sin\left(n\pi y(F)\right) \tag{3.102}
\]

for $F_0, F \in [L, H]$, where

\[
y(F) = \frac{X(F) - X(L)}{X(H) - X(L)} = \frac{\log\frac{F}{L}}{\log\frac{H}{L}}, \tag{3.103}
\]

\[
\bar\rho_n = \frac{\sigma_0^2}{8} + \rho_n = \frac{\sigma_0^2}{8} + \frac{n^2\pi^2\sigma_0^2}{2\log^2\frac{H}{L}}. \tag{3.104}
\]

This series is easily shown to converge for all positive $\tau$ values and gives a representation
of the Dirac delta function $\delta(F - F_0)$ for the finite domain $F, F_0 \in [L, H]$ when $\tau = 0$.
[Note: $\delta(F - F_0) = X'(F)\,\delta(X(F) - X(F_0)) = \frac{\sqrt{2}}{\sigma F}\,\delta(X(F) - X(F_0))$.] Of practical importance is
the fact that the convergence of the series is fairly rapid since the eigenvalues $\bar{\rho}_n$ grow as $n^2$ as
n increases; i.e., contributions from the higher-frequency sine functions are diminished by the
dominant factor $e^{-\bar{\rho}_n \tau}$, which decreases rapidly as a Gaussian function in n. Although more
terms are required to achieve the same level of accuracy as the time to maturity is decreased, a
uniformly high level of accuracy (and positivity in the density) can be achieved by retaining a
relatively small number of terms in the sum (see Figure 3.4). Moreover, similar expressions (as
demonstrated next) for pricing double-barrier options require a substantially smaller number
of terms for high accuracy.

FIGURE 3.4 Uniform convergence of the density given by equation (3.102) for L = 10, H = 50,
$F_0 = 20$, $\sigma = 0.2$, $\tau = 0.1$. The three curves correspond to using the first 10 (dashed line), 20, and
30 terms (the thick solid line) in the series sum.

A double knockout European-style option maturing in time $\tau$ is
then priced by taking the expectation of the payoff $\Lambda(F)$ over the allowable region (ignoring
discounting):


$$V^{DB}(F_0, \tau) = \int_L^H U^{DB}(F, F_0; \tau)\,\Lambda(F)\,dF. \tag{3.105}$$
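As a rough numerical illustration of equations (3.102)-(3.105), the following Python sketch evaluates the truncated series for the double-barrier density and integrates it over [L, H] with a simple midpoint rule, i.e., equation (3.105) with unit payoff. The function names, the number of retained terms, and the grid size are illustrative assumptions and not part of the text. The parameters are those of Figure 3.4, for which both barriers lie many standard deviations away from $F_0$, so the series should be close to the barrier-free lognormal density and should integrate to nearly one.

```python
import math

def db_density_linear(F, F0, tau, sigma, L, H, n_terms=200):
    """Truncated series (3.102): double-barrier density, linear volatility model."""
    log_hl = math.log(H / L)
    y = math.log(F / L) / log_hl    # y(F), equation (3.103)
    y0 = math.log(F0 / L) / log_hl  # y(F0)
    total = 0.0
    for n in range(1, n_terms + 1):
        # eigenvalue of equation (3.104)
        rho_bar = sigma**2 / 8.0 + (n * math.pi * sigma) ** 2 / (2.0 * log_hl**2)
        total += (math.exp(-rho_bar * tau)
                  * math.sin(n * math.pi * y0) * math.sin(n * math.pi * y))
    return 2.0 / log_hl * math.sqrt(F0 / F**3) * total

def survival_probability(F0, tau, sigma, L, H, n_grid=2000):
    """Midpoint-rule integral of the density over [L, H] (payoff equal to 1)."""
    h = (H - L) / n_grid
    return h * sum(db_density_linear(L + (i + 0.5) * h, F0, tau, sigma, L, H)
                   for i in range(n_grid))

def lognormal_density(F, F0, tau, sigma):
    """Barrier-free zero-drift lognormal density, used only as a sanity check."""
    v = sigma**2 * tau
    return (math.exp(-(math.log(F / F0) + 0.5 * v) ** 2 / (2.0 * v))
            / (F * math.sqrt(2.0 * math.pi * v)))

if __name__ == "__main__":
    print(survival_probability(20.0, 0.1, 0.2, 10.0, 50.0))
    print(db_density_linear(21.0, 20.0, 0.1, 0.2, 10.0, 50.0),
          lognormal_density(21.0, 20.0, 0.1, 0.2))
```

Reducing `n_terms` to 10, 20, or 30 gives truncations of the kind compared in Figure 3.4.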


For example, a double knockout European call struck at K is priced by inserting $U^{DB}(F, F_0; \tau)$
with $\Lambda(F) = (F - K)_+$ and integrating, term by term, in the series to obtain exact analytical
series expressions for the option value. In carrying out the integration it is very convenient
to change integration variables $F \to x$ as defined by the original variable transformation
$F = F(x) = e^{\sigma x/\sqrt{2}}$, i.e., using the F-space density $F'(x)\,U^{DB}(F(x), F_0; \tau)$ expressed as a function
of the x variable (see Problem 2). Two separate formulas arise according to whether $K < L$
or $K \ge L$ (in both cases $K < H$; otherwise the strike is at or above the upper barrier and the option
is worthless). For $K \ge L$ the formula for the call is


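$$C^{DB}(F_0, K, \tau) = \frac{\sigma^2\sqrt{F_0}}{\log(H/L)}\sum_{n=1}^{\infty}\frac{e^{-\bar{\rho}_n\tau}}{\bar{\rho}_n}\sin(n\pi y(F_0))
\left[\frac{n\pi(-1)^n}{\log(H/L)}\,\frac{K - H}{\sqrt{H}} - \sqrt{K}\,\sin(n\pi y(K))\right] \tag{3.106}$$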


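and for the case K < L is

$$C^{DB}(F_0, K, \tau) = \frac{\sigma^2\sqrt{F_0}}{\log^2(H/L)}\sum_{n=1}^{\infty}\frac{n\pi\, e^{-\bar{\rho}_n\tau}}{\bar{\rho}_n}\sin(n\pi y(F_0))
\left[(-1)^n\frac{K - H}{\sqrt{H}} - \frac{K - L}{\sqrt{L}}\right], \tag{3.107}$$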


where $y(\cdot)$ and $\bar{\rho}_n$ are given by equations (3.103) and (3.104). Note that the two expressions
are equivalent when K = L. Figure 3.5 displays a typical convergence when applying
equation (3.106).

FIGURE 3.5 Uniform rapid convergence of the value of the double knockout call as more terms are
used in the series formula for $C^{DB}$, where L = 10, H = 50, K = 20, $\sigma = 0.2$, $\tau = 0.25$. The five
separate curves correspond to the truncated series sum in the first 1, 2, 4, 8, and 12 (solid line) terms
of equation (3.106).
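The two series for the double knockout call can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names and the default number of terms are hypothetical, discounting is ignored as in the text, and the Black formula for a call on a forward is included only as a comparison value for the case where both barriers are far from $F_0$.

```python
import math

def dko_call_linear(F0, K, tau, sigma, L, H, n_terms=200):
    """Double knockout call: series (3.106) for K >= L, series (3.107) for K < L."""
    if K >= H:
        return 0.0  # strike at or above the upper barrier: option is worthless
    log_hl = math.log(H / L)
    y0 = math.log(F0 / L) / log_hl
    total = 0.0
    for n in range(1, n_terms + 1):
        rho_bar = sigma**2 / 8.0 + (n * math.pi * sigma) ** 2 / (2.0 * log_hl**2)
        sign = -1.0 if n % 2 else 1.0  # (-1)^n
        upper = n * math.pi * sign / log_hl * (K - H) / math.sqrt(H)
        if K >= L:
            y_k = math.log(K / L) / log_hl
            lower = math.sqrt(K) * math.sin(n * math.pi * y_k)
        else:
            lower = n * math.pi / log_hl * (K - L) / math.sqrt(L)
        total += (math.exp(-rho_bar * tau) / rho_bar
                  * math.sin(n * math.pi * y0) * (upper - lower))
    return sigma**2 * math.sqrt(F0) / log_hl * total

def black_call(F0, K, tau, sigma):
    """Undiscounted Black call on a forward, used only as a far-barrier check."""
    s = sigma * math.sqrt(tau)
    d_plus = (math.log(F0 / K) + 0.5 * s * s) / s
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return F0 * cdf(d_plus) - K * cdf(d_plus - s)

if __name__ == "__main__":
    # Parameters of Figure 3.5 with spot F0 = 20
    for n_terms in (1, 2, 4, 8, 12):
        print(n_terms, dko_call_linear(20.0, 20.0, 0.25, 0.2, 10.0, 50.0, n_terms))
    print("Black:", black_call(20.0, 20.0, 0.25, 0.2))
```

The loop over `n_terms` mirrors the truncation levels shown in Figure 3.5.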





Problems 
Problem 1. We wish to solve

$$\frac{\partial u}{\partial \tau} = \frac{\partial^2 u}{\partial x^2},$$

subject to $u(x = x_L, \tau) = u(x = x_H, \tau) = 0$ and initial condition $u(x, \tau = 0) = u_0(x)$. Since
the solution must vanish at the endpoints of the interval $[x_L, x_H]$, one method is to express u
as a Fourier sine series:

$$u(x, \tau) = \sum_{n=1}^{\infty} b_n(\tau)\sin\frac{n\pi(x - x_L)}{x_H - x_L}, \tag{3.108}$$

with coefficients $b_n(\tau)$ depending only on $\tau$. Using direct substitution and by satisfying the
initial condition, show that

$$b_n(\tau) = a_n e^{-\rho_n \tau}, \tag{3.109}$$

where $\rho_n = n^2\pi^2/(x_H - x_L)^2$ and where $a_n = b_n(0)$ is

$$a_n = \frac{2}{x_H - x_L}\int_{x_L}^{x_H} u_0(x)\sin\frac{n\pi(x - x_L)}{x_H - x_L}\,dx. \tag{3.110}$$

Hence, recover equation (3.101) when $u_0(x) = \delta(x - x_0)$.


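Problem 2. From equation (3.105) we see that the double-barrier call option can be derived
by computing the integrals

$$\Phi(K) = \int_K^H U^{DB}(F, F_0; \tau)\,dF, \tag{3.111}$$

$$\bar{\Phi}(K) = \int_K^H U^{DB}(F, F_0; \tau)\,F\,dF, \tag{3.112}$$

for $L \le K < H$. By using equation (3.102) and a change-of-integration variable $F \to x$ defined
by $F = e^{\sigma x/\sqrt{2}}$, show that the latter integral is

$$\bar{\Phi}(K) = \frac{\sqrt{2}\,\sigma\sqrt{F_0}}{\log(H/L)}\sum_{n=1}^{\infty} e^{-\bar{\rho}_n\tau}\sin(n\pi y(F_0))
\int_{X(K)}^{X(H)} e^{\sigma x/2\sqrt{2}}\sin\left(\frac{n\pi(x - X(L))}{X(H) - X(L)}\right)dx,$$

where $X(\cdot)$ is defined by equation (3.86). Apply the indefinite integral identity
$\int e^{ax}\sin bx\,dx = e^{ax}[a\sin bx - b\cos bx]/(a^2 + b^2) + c$, where a, b, c are any constants,
and obtain

$$\bar{\Phi}(K) = \frac{\sigma^2\sqrt{F_0}}{\log(H/L)}\sum_{n=1}^{\infty}\frac{e^{-\bar{\rho}_n\tau}}{\bar{\rho}_n}\sin(n\pi y(F_0))
\left[-\frac{n\pi(-1)^n}{\log(H/L)}\sqrt{H} + \sqrt{K}\left(\frac{n\pi}{\log(H/L)}\cos(n\pi y(K)) - \frac{1}{2}\sin(n\pi y(K))\right)\right]. \tag{3.113}$$

Using a similar procedure, obtain

$$\Phi(K) = \frac{\sigma^2\sqrt{F_0}}{\log(H/L)}\sum_{n=1}^{\infty}\frac{e^{-\bar{\rho}_n\tau}}{\bar{\rho}_n}\sin(n\pi y(F_0))
\left[-\frac{n\pi(-1)^n}{\log(H/L)}\frac{1}{\sqrt{H}} + \frac{1}{\sqrt{K}}\left(\frac{n\pi}{\log(H/L)}\cos(n\pi y(K)) + \frac{1}{2}\sin(n\pi y(K))\right)\right]. \tag{3.114}$$

Using $C^{DB} = \bar{\Phi}(K) - K\Phi(K)$ for $K \ge L$ and $C^{DB} = \bar{\Phi}(L) - K\Phi(L)$ for $K < L$, obtain
equations (3.106) and (3.107).


Problem 3. By computing the integrals

$$\Phi(K) = \int_L^K U^{DB}\,dF \tag{3.115}$$

and

$$\bar{\Phi}(K) = \int_L^K U^{DB}\,F\,dF \tag{3.116}$$

and using steps similar to those in Problem 2, derive an exact series expression for the
corresponding double-barrier put option value $P^{DB}(F_0, K, \tau)$ for strike K, L < K < H.


3.5.2 Quadratic Volatility Models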


We now consider the problem of pricing European options, including barriers, for the more
complex quadratic volatility model with two distinct roots:⁴

$$\sigma(F) = \sigma\,\frac{(F - \underline{F})(\bar{F} - F)}{\bar{F} - \underline{F}}. \tag{3.117}$$

⁴A quadratic volatility function is generally of the form $\sigma(F) = \sigma_0(F - \underline{F})(\bar{F} - F)$, where $\underline{F}$, $\bar{F}$ are two real
roots. Here, we find it useful to express the nonzero parameter as a ratio $\sigma_0 = \sigma/(\bar{F} - \underline{F})$. In this way, the parameter
$\sigma$ corresponds to the volatility parameter in the linear model in the limit $\bar{F} \to \infty$.

FIGURE 3.6 Example of a quadratic volatility function with two distinct roots $\underline{F} = 5$, $\bar{F} = 100$,
$\sigma = 0.2$. The linear volatility function for given parameter $\sigma$ is drawn for direct comparison. The linear
model obtains in the limit $\bar{F} \to \infty$.


Figure 3.6 depicts the shape of a quadratic volatility function in comparison with an affine
(linear) model $\sigma(F) = \sigma(F - \underline{F})$, for given volatility parameter $\sigma$. Without loss of generality,
throughout we assume $\underline{F} < \bar{F}$, with underlying asset price $F \in [\underline{F}, \bar{F}]$. We note that the separate
case of the single double-root quadratic model is discussed later in this chapter. That double-root model
is a special case of the constant-elasticity-of-variance (CEV) model, which itself is shown to
obtain as a special case of a more general Bessel family of solutions. These more general
families of exact solutions are discussed later in this chapter. Here, we consider obtaining
solutions to the model in equation (3.117) by mapping the (forward) price space F onto the
x-space of the Wiener process. That is, the zero-drift Wiener process with constant volatility
$\nu(x) = \sqrt{2}$ can again be chosen as underlying process in x-space and thereby ultimately
provide exact solutions to the quadratic volatility model in F-space. As in the linear model, 
we have the constant œ = —o?/8. (Note: This is the constant a, ,, corresponding to the 
diffusion canonical transformation x —> F described later in the chapter.) The transformation 
of x = X(F) is defined by equating the Jacobian of the transformation to the ratio of the 
volatility functions in both spaces: 


$$\frac{dX(F)}{dF} = \frac{\nu(X(F))}{\sigma(F)} = \frac{\sqrt{2}\,(\bar{F} - \underline{F})}{\sigma\,(F - \underline{F})(\bar{F} - F)}. \tag{3.118}$$

This implies the following monotonically increasing map:

$$X(F) = \frac{\sqrt{2}}{\sigma}\log\left|\frac{F - \underline{F}}{\bar{F} - F}\right| = \frac{\sqrt{2}}{\sigma}\log\frac{F - \underline{F}}{\bar{F} - F}. \tag{3.119}$$

This is a one-to-one map of the domain $x \in (-\infty, +\infty)$ onto the domain $F \in (\underline{F}, \bar{F})$, with
inverse relation

$$F(x) = \frac{\underline{F} + \bar{F}\,e^{\sigma x/\sqrt{2}}}{1 + e^{\sigma x/\sqrt{2}}}. \tag{3.120}$$

That is, $X(\underline{F}) = -\infty$, $X(\bar{F}) = \infty$, $F(-\infty) = \underline{F}$, $F(\infty) = \bar{F}$. As shown later in this section,
the form in equation (3.117) is convenient for showing that the solutions to the quadratic
(double-root) model directly recover the corresponding known solutions to the linear volatility
(i.e., affine or lognormal) model by simply taking the limit $\bar{F} \to \infty$ in all expressions. This
is in fact a mathematical consistency requirement of the theory.
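The change of variables in equations (3.117)-(3.120) can be checked numerically with a short Python sketch. The function names and the sample roots (5 and 100, as in Figure 3.6) are illustrative assumptions; the sketch only verifies that the two maps invert each other and that the slope of X(F) equals $\sqrt{2}/\sigma(F)$, as stated in equation (3.118).

```python
import math

def quadratic_vol(F, sigma, f_lo, f_hi):
    """Volatility function (3.117) with roots f_lo < f_hi."""
    return sigma * (F - f_lo) * (f_hi - F) / (f_hi - f_lo)

def x_of_f(F, sigma, f_lo, f_hi):
    """Map (3.119) from the price interval (f_lo, f_hi) to the real line."""
    return math.sqrt(2.0) / sigma * math.log((F - f_lo) / (f_hi - F))

def f_of_x(x, sigma, f_lo, f_hi):
    """Inverse map (3.120)."""
    e = math.exp(sigma * x / math.sqrt(2.0))
    return (f_lo + f_hi * e) / (1.0 + e)

if __name__ == "__main__":
    sigma, f_lo, f_hi, F = 0.2, 5.0, 100.0, 20.0
    x = x_of_f(F, sigma, f_lo, f_hi)
    print(x, f_of_x(x, sigma, f_lo, f_hi))  # second value should reproduce F
    h = 1e-5  # step for a central finite difference of X(F)
    slope = (x_of_f(F + h, sigma, f_lo, f_hi) - x_of_f(F - h, sigma, f_lo, f_hi)) / (2 * h)
    print(slope, math.sqrt(2.0) / quadratic_vol(F, sigma, f_lo, f_hi))
```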

By specializing equation (3.259), the exact relationship between a transition probability
density function, or pricing kernel, U for the quadratic volatility model and a kernel u for the
Wiener process is given by [i.e., the analogue of equation (3.87)]

$$U(F, F_0; \tau) = \frac{\sqrt{2}\,(\bar{F} - \underline{F})}{\sigma}\sqrt{\frac{(F_0 - \underline{F})(\bar{F} - F_0)}{(F - \underline{F})^3(\bar{F} - F)^3}}\;e^{-\sigma^2\tau/8}\;u(X(F), X(F_0); \tau), \tag{3.121}$$

$F, F_0 \in (\underline{F}, \bar{F})$. This equation relates the density for the quadratic model to that of the simple
Wiener model. By direct substitution and by using equations (3.7) and (3.8), one can verify
that $U(F, F_0; \tau)$ in equation (3.121) satisfies both forward and backward time-homogeneous
Kolmogorov equations in F, $F_0$ for the zero-drift function and volatility function given by
equation (3.117). Later in the chapter, the reader will learn to derive this relation based on
the canonical diffusion mapping methodology.

Following a similar procedure to that in the previous section, the pricing kernels for the
barrier-free case as well as for single and double barriers arise by direct substitution of the
x-space kernel u satisfying the appropriate boundary conditions. In particular, zero-boundary
conditions, $U(F = \underline{F}, F_0; \tau) = U(F = \bar{F}, F_0; \tau) = 0$ (with the same conditions also holding
in $F_0$), are satisfied by substituting the solution $u(x, x_0; \tau) = g_0(x, x_0; \tau)$ of equation (3.9)
into equation (3.121). Upon using equation (3.119) and rearranging logarithmic terms, the
barrier-free kernel is given in exact form:

$$U(F, F_0; \tau) = \frac{\bar{F} - \underline{F}}{\sigma\sqrt{2\pi\tau}}\sqrt{\frac{(F_0 - \underline{F})(\bar{F} - F_0)}{(F - \underline{F})^3(\bar{F} - F)^3}}\;e^{-\sigma^2\tau/8}
\exp\left[-\frac{1}{2\sigma^2\tau}\log^2\frac{(F - \underline{F})(\bar{F} - F_0)}{(\bar{F} - F)(F_0 - \underline{F})}\right]. \tag{3.122}$$


This kernel may be compared to the zero-drift lognormal density kernel in equation (3.88),
which obtains as a simpler case in the limit $\bar{F} \to \infty$, $\underline{F} = 0$. For computing integral expectations
(i.e., for pricing purposes) it is convenient to work in terms of the x variable. Using
equation (3.120), F = F(x), and equation (3.122) gives the density (see Problem 1)

$$U(F(x), F_0; \tau)\,\frac{dF}{dx} = \frac{e^{-\sigma^2\tau/8}}{2\sqrt{\pi\tau}}\;\frac{\cosh\left(\frac{\sigma x}{2\sqrt{2}}\right)}{\cosh\left(\frac{\sigma x_0}{2\sqrt{2}}\right)}\;e^{-(x - x_0)^2/4\tau}, \tag{3.123}$$

$x_0 = X(F_0)$. From this it readily follows that the barrier-free kernel conserves probability
(see Problem 1). The price of a plain European-style option maturing in time $\tau$ is given by
(ignoring discounting)

$$V(F_0, \tau) = \int_{\underline{F}}^{\bar{F}} U(F, F_0; \tau)\,\Lambda(F)\,dF. \tag{3.124}$$

Calls or puts on the forward price struck at K with payoff $\Lambda(F) = (F - K)_+$ or
$\Lambda(F) = (K - F)_+$, respectively, are readily priced. In particular, the price of a call with $\underline{F} < K < \bar{F}$
takes the form (ignoring discounting)

$$C(F_0, K, \tau) = \bar{\Phi}(K) - K\,\Phi(K), \tag{3.125}$$

where

$$\Phi(K) = \int_K^{\bar{F}} U(F, F_0; \tau)\,dF, \qquad \bar{\Phi}(K) = \int_K^{\bar{F}} U(F, F_0; \tau)\,F\,dF. \tag{3.126}$$

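By changing integration variable $F \to x = X(F)$ and using equation (3.123),

$$\Phi(K) = \frac{e^{-\sigma^2\tau/8}}{2\sqrt{\pi\tau}\,\cosh\left(\frac{\sigma x_0}{2\sqrt{2}}\right)}\int_{X(K)}^{\infty}\cosh\left(\frac{\sigma x}{2\sqrt{2}}\right)e^{-(x - x_0)^2/4\tau}\,dx, \tag{3.127}$$

with $X(K) = \frac{\sqrt{2}}{\sigma}\log[(K - \underline{F})/(\bar{F} - K)]$. This integral is readily evaluated via the identity (see
Problem 3)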











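$$\int_{X(K)}^{\infty} e^{\pm\sigma x/2\sqrt{2}}\,e^{-(x - x_0)^2/4\tau}\,dx = 2\sqrt{\pi\tau}\;e^{\sigma^2\tau/8}\,e^{\pm\sigma x_0/2\sqrt{2}}\,N(d_\pm(X)), \tag{3.128}$$

where

$$X = \frac{(\bar{F} - K)(F_0 - \underline{F})}{(K - \underline{F})(\bar{F} - F_0)} \tag{3.129}$$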


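and $N(\cdot)$ is the standard normal cumulative distribution function. Throughout this section we define

$$d_\pm(x) = \frac{\log x \pm \frac{1}{2}\sigma^2\tau}{\sigma\sqrt{\tau}}. \tag{3.130}$$

From equation (3.128) and using $x_0 = X(F_0)$, namely, $e^{\sigma x_0/2\sqrt{2}} = [(F_0 - \underline{F})/(\bar{F} - F_0)]^{1/2}$ and
$\cosh\left(\frac{\sigma x_0}{2\sqrt{2}}\right) = \frac{1}{2}(\bar{F} - \underline{F})[(F_0 - \underline{F})(\bar{F} - F_0)]^{-1/2}$, we hence obtain

$$\Phi(K) = (\bar{F} - \underline{F})^{-1}\left[(F_0 - \underline{F})\,N(d_+(X)) + (\bar{F} - F_0)\,N(d_-(X))\right]. \tag{3.131}$$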


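The second integral in equation (3.126) is evaluated in similar fashion, namely, by changing
integration variable $F \to x = X(F)$, using the identity

$$F(x)\cosh\left(\frac{\sigma x}{2\sqrt{2}}\right) = \frac{1}{2}\left[\bar{F}\,e^{\sigma x/2\sqrt{2}} + \underline{F}\,e^{-\sigma x/2\sqrt{2}}\right], \tag{3.132}$$

which follows from equation (3.120), and integrating with the use of equation (3.128),

$$\bar{\Phi}(K) = (\bar{F} - \underline{F})^{-1}\left[\bar{F}(F_0 - \underline{F})\,N(d_+(X)) + \underline{F}(\bar{F} - F_0)\,N(d_-(X))\right]. \tag{3.133}$$

Combining equations (3.131) and (3.133) finally gives the call price:

$$C(F_0, K, \tau) = (\bar{F} - \underline{F})^{-1}\left[(\bar{F} - K)(F_0 - \underline{F})\,N(d_+(X)) - (K - \underline{F})(\bar{F} - F_0)\,N(d_-(X))\right]. \tag{3.134}$$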
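The closed-form call price (3.134) is simple to evaluate. The following Python sketch is a minimal illustration; the function names are hypothetical, the normal distribution function is built from `math.erf`, and discounting is ignored as in the text. Taking the upper root very large with the lower root at zero should reproduce the Black price of a call on a forward, which is the consistency requirement noted earlier.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function N(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def call_quadratic(F0, K, tau, sigma, f_lo, f_hi):
    """Undiscounted call price (3.134) for the two-root quadratic volatility model."""
    X = (f_hi - K) * (F0 - f_lo) / ((K - f_lo) * (f_hi - F0))  # equation (3.129)
    s = sigma * math.sqrt(tau)
    d_plus = (math.log(X) + 0.5 * s * s) / s                   # equation (3.130)
    d_minus = d_plus - s
    return ((f_hi - K) * (F0 - f_lo) * norm_cdf(d_plus)
            - (K - f_lo) * (f_hi - F0) * norm_cdf(d_minus)) / (f_hi - f_lo)

def black_call(F0, K, tau, sigma):
    """Undiscounted Black call on a forward (lognormal limit), for comparison."""
    s = sigma * math.sqrt(tau)
    d_plus = (math.log(F0 / K) + 0.5 * s * s) / s
    return F0 * norm_cdf(d_plus) - K * norm_cdf(d_plus - s)

if __name__ == "__main__":
    print(call_quadratic(20.0, 20.0, 0.25, 0.2, 5.0, 100.0))
    print(call_quadratic(20.0, 20.0, 0.25, 0.2, 0.0, 1e9), black_call(20.0, 20.0, 0.25, 0.2))
```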


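The put price is derived in similar fashion (see Problem 4).

For a single barrier with absorption at level H, $\underline{F} < H < \bar{F}$, the pricing kernel is obtained in
exact form by simply substituting the kernel $u(X(F), X(F_0); \tau) = g(X(H), X(F), X(F_0); \tau)$
of equation (3.14) into equation (3.121), giving

$$U(H, F, F_0; \tau) = \frac{\bar{F} - \underline{F}}{\sigma\sqrt{2\pi\tau}}\sqrt{\frac{(F_0 - \underline{F})(\bar{F} - F_0)}{(F - \underline{F})^3(\bar{F} - F)^3}}\;e^{-\sigma^2\tau/8}
\left\{\exp\left[-\frac{1}{2\sigma^2\tau}\log^2\frac{(F - \underline{F})(\bar{F} - F_0)}{(\bar{F} - F)(F_0 - \underline{F})}\right]
- \exp\left[-\frac{1}{2\sigma^2\tau}\log^2\frac{(F - \underline{F})(F_0 - \underline{F})(\bar{F} - H)^2}{(\bar{F} - F)(\bar{F} - F_0)(H - \underline{F})^2}\right]\right\}. \tag{3.135}$$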





The boundary conditions $U(H, F = H, F_0; \tau) = 0$, $U(H, F, F_0 = H; \tau) = 0$ are obviously satisfied.
This kernel is hence useful for pricing single-barrier options for the quadratic (double-root)
volatility models. Note that the kernel in equation (3.89) obtains in the limit $\bar{F} \to \infty$,
$\underline{F} = 0$. Exact formulas for single-barrier knock-in and knockout calls/puts are most readily
derived by changing variables of integration, as was done in the earlier barrier-free case.
In particular, using equation (3.120), F = F(x), and equation (3.135) gives the analogue of
equation (3.123):

$$U(H, F(x), F_0; \tau)\,\frac{dF}{dx} = \frac{e^{-\sigma^2\tau/8}}{2\sqrt{\pi\tau}}\;\frac{\cosh\left(\frac{\sigma x}{2\sqrt{2}}\right)}{\cosh\left(\frac{\sigma x_0}{2\sqrt{2}}\right)}
\left[e^{-(x - x_0)^2/4\tau} - e^{-(x + x_0 - 2x_H)^2/4\tau}\right], \tag{3.136}$$

$x_0 = X(F_0)$, $x_H = X(H)$.

European-style single-barrier knock-in and knockout option price formulas are then
derived by integrating the density in equation (3.136) against the payoff in the appropriate
domain. In what follows we derive the knockout option prices, as the knock-in prices follow
simply from (knock-in)-(knockout) symmetry. A down-and-out call option, expiring worthless
if the barrier F = H is crossed before a time to maturity $\tau$ with current (forward) price
$F_0 > H$, $\underline{F} < H$, $K < \bar{F}$, has value (ignoring discounting throughout)


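$$C^{DO}(H, F_0, K, \tau) = \int_H^{\bar{F}} U(H, F, F_0; \tau)\,(F - K)_+\,dF
= \begin{cases} \bar{\Phi}(K) - K\,\Phi(K), & K \ge H \\ \bar{\Phi}(H) - K\,\Phi(H), & K < H, \end{cases} \tag{3.137}$$

where $\Phi(\cdot)$ and $\bar{\Phi}(\cdot)$ are defined by

$$\Phi(B) = \int_B^{\bar{F}} U(H, F, F_0; \tau)\,dF, \qquad \bar{\Phi}(B) = \int_B^{\bar{F}} U(H, F, F_0; \tau)\,F\,dF \tag{3.138}$$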


for any real value B such that $\underline{F} < B < \bar{F}$. Following similar steps as earlier, these integrals are
reduced to standard cumulative normal functions. Changing variables $F \to x = X(F)$ and
using equation (3.136),

$$\Phi(K) = \frac{e^{-\sigma^2\tau/8}}{2\sqrt{\pi\tau}\,\cosh\left(\frac{\sigma x_0}{2\sqrt{2}}\right)}\int_{X(K)}^{\infty}\cosh\left(\frac{\sigma x}{2\sqrt{2}}\right)
\left[e^{-(x - x_0)^2/4\tau} - e^{-(x + x_0 - 2x_H)^2/4\tau}\right]dx.$$

This integral is evaluated in two parts. The first exponential integral [i.e., identically as in
equation (3.127)] is given by expression (3.131). The second term is integrated in identical
fashion by using the same integral identity (3.128) with the replacement $x_0 \to 2x_H - x_0$ and
then using $2x_H - x_0 = 2X(H) - X(F_0) = \frac{\sqrt{2}}{\sigma}\log\frac{(H - \underline{F})^2(\bar{F} - F_0)}{(\bar{F} - H)^2(F_0 - \underline{F})}$, so that

$$e^{\sigma(2x_H - x_0)/2\sqrt{2}} = \left\{\left[\frac{H - \underline{F}}{\bar{F} - H}\right]^2\frac{\bar{F} - F_0}{F_0 - \underline{F}}\right\}^{1/2}. \tag{3.139}$$


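Combining the results of the two integrals:

$$\Phi(K) = (\bar{F} - \underline{F})^{-1}\left[(F_0 - \underline{F})\,N(d_+(X)) + (\bar{F} - F_0)\,N(d_-(X))
- \left(\frac{H - \underline{F}}{\bar{F} - H}\right)(\bar{F} - F_0)\,N(d_+(Y)) - \left(\frac{\bar{F} - H}{H - \underline{F}}\right)(F_0 - \underline{F})\,N(d_-(Y))\right], \tag{3.140}$$

where $d_\pm(\cdot)$ and X are given by equations (3.130) and (3.129) and

$$Y = \frac{(\bar{F} - K)(\bar{F} - F_0)(H - \underline{F})^2}{(K - \underline{F})(F_0 - \underline{F})(\bar{F} - H)^2}. \tag{3.141}$$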




The second integral in equation (3.138) for B = K is evaluated in similar fashion, namely,
by using equation (3.136) and identity (3.132):

$$\bar{\Phi}(K) = \frac{e^{-\sigma^2\tau/8}}{4\sqrt{\pi\tau}\,\cosh\left(\frac{\sigma x_0}{2\sqrt{2}}\right)}\int_{X(K)}^{\infty}
\left[\bar{F}\,e^{\sigma x/2\sqrt{2}} + \underline{F}\,e^{-\sigma x/2\sqrt{2}}\right]\left[e^{-(x - x_0)^2/4\tau} - e^{-(x + x_0 - 2x_H)^2/4\tau}\right]dx.$$

Applying equation (3.128) on each of the four terms and expressing the result in terms of $F_0$,
H, K while using equation (3.139) and simplifying gives

$$\bar{\Phi}(K) = (\bar{F} - \underline{F})^{-1}\left[\bar{F}(F_0 - \underline{F})\,N(d_+(X)) + \underline{F}(\bar{F} - F_0)\,N(d_-(X))
- \bar{F}\left(\frac{H - \underline{F}}{\bar{F} - H}\right)(\bar{F} - F_0)\,N(d_+(Y)) - \underline{F}\left(\frac{\bar{F} - H}{H - \underline{F}}\right)(F_0 - \underline{F})\,N(d_-(Y))\right]. \tag{3.142}$$


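Combining equations (3.140) and (3.142) in equation (3.137) hence gives the exact down-and-out
call price for K > H:

$$C^{DO}(H, F_0, K, \tau) = (\bar{F} - \underline{F})^{-1}\left[(\bar{F} - K)(F_0 - \underline{F})\,N(d_+(X)) + (\underline{F} - K)(\bar{F} - F_0)\,N(d_-(X))
+ (K - \bar{F})(\bar{F} - F_0)\left(\frac{H - \underline{F}}{\bar{F} - H}\right)N(d_+(Y))
+ (K - \underline{F})(F_0 - \underline{F})\left(\frac{\bar{F} - H}{H - \underline{F}}\right)N(d_-(Y))\right]. \tag{3.143}$$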


By taking the limit $\bar{F} \to \infty$ of this expression, the reader can easily verify that the exact
formula for the down-and-out price for the affine linear volatility model is obtained and that
in particular by also setting $\underline{F} = 0$, equation (3.100) (i.e., the price assuming a lognormal
density model) is exactly recovered, as required.

For K < H the down-and-out call price is obtained by evaluating $\Phi(H)$ and $\bar{\Phi}(H)$. These
quantities are derived by replacing the lower integration value X(K) by X(H) in the foregoing
integrals for $\Phi(K)$ and $\bar{\Phi}(K)$. This is equivalent to setting $K \to H$ in equations (3.129)
and (3.141), giving


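$$\Phi(H) = (\bar{F} - \underline{F})^{-1}\left[(F_0 - \underline{F})\,N(d_+(A)) + (\bar{F} - F_0)\,N(d_-(A))
- \left(\frac{H - \underline{F}}{\bar{F} - H}\right)(\bar{F} - F_0)\,N(d_+(B)) - \left(\frac{\bar{F} - H}{H - \underline{F}}\right)(F_0 - \underline{F})\,N(d_-(B))\right], \tag{3.144}$$

$$\bar{\Phi}(H) = (\bar{F} - \underline{F})^{-1}\left[\bar{F}(F_0 - \underline{F})\,N(d_+(A)) + \underline{F}(\bar{F} - F_0)\,N(d_-(A))
- \bar{F}\left(\frac{H - \underline{F}}{\bar{F} - H}\right)(\bar{F} - F_0)\,N(d_+(B)) - \underline{F}\left(\frac{\bar{F} - H}{H - \underline{F}}\right)(F_0 - \underline{F})\,N(d_-(B))\right], \tag{3.145}$$

where

$$A = \frac{(\bar{F} - H)(F_0 - \underline{F})}{(H - \underline{F})(\bar{F} - F_0)}, \qquad B = \frac{(H - \underline{F})(\bar{F} - F_0)}{(F_0 - \underline{F})(\bar{F} - H)}, \tag{3.146}$$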


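and $d_\pm(\cdot)$ is defined by equation (3.130). Combining these expressions gives the analytically
exact down-and-out call price for K < H:

$$C^{DO}(H, F_0, K, \tau) = (\bar{F} - \underline{F})^{-1}\left[(\bar{F} - K)(F_0 - \underline{F})\,N(d_+(A)) + (\underline{F} - K)(\bar{F} - F_0)\,N(d_-(A))
+ (K - \bar{F})(\bar{F} - F_0)\left(\frac{H - \underline{F}}{\bar{F} - H}\right)N(d_+(B))
+ (K - \underline{F})(F_0 - \underline{F})\left(\frac{\bar{F} - H}{H - \underline{F}}\right)N(d_-(B))\right]. \tag{3.147}$$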





The limit $\bar{F} \to \infty$ of this expression reduces to the exact formula for the down-and-out price
for the affine linear volatility model; and further, by setting $\underline{F} = 0$, equation (3.99) is also
recovered. The price of a down-and-out put is derived in similar fashion (see Problem 5).
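The two down-and-out call formulas (3.143) and (3.147) differ only in whether K or H enters the arguments of the normal distribution function, so they can be combined in one routine. The Python sketch below is an illustration under that reading; the function names are hypothetical and discounting is ignored. The price should vanish when the spot sits on the barrier, and for very large upper root with zero lower root it should approach the lognormal down-and-out price.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function N(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def down_and_out_call_quadratic(F0, K, H, tau, sigma, f_lo, f_hi):
    """Undiscounted down-and-out call, equations (3.143) (K >= H) and (3.147) (K < H)."""
    s = sigma * math.sqrt(tau)

    def d_plus(z):
        return (math.log(z) + 0.5 * s * s) / s

    def d_minus(z):
        return (math.log(z) - 0.5 * s * s) / s

    b = max(K, H)  # lower integration limit: K if K >= H, otherwise H
    # (3.129)/(3.141) for b = K, or A and B of (3.146) for b = H
    x_arg = (f_hi - b) * (F0 - f_lo) / ((b - f_lo) * (f_hi - F0))
    y_arg = ((f_hi - b) * (f_hi - F0) * (H - f_lo) ** 2
             / ((b - f_lo) * (F0 - f_lo) * (f_hi - H) ** 2))
    r = (H - f_lo) / (f_hi - H)
    value = ((f_hi - K) * (F0 - f_lo) * norm_cdf(d_plus(x_arg))
             + (f_lo - K) * (f_hi - F0) * norm_cdf(d_minus(x_arg))
             + (K - f_hi) * (f_hi - F0) * r * norm_cdf(d_plus(y_arg))
             + (K - f_lo) * (F0 - f_lo) / r * norm_cdf(d_minus(y_arg)))
    return value / (f_hi - f_lo)

if __name__ == "__main__":
    print(down_and_out_call_quadratic(25.0, 22.0, 20.0, 1.0, 0.2, 5.0, 100.0))
    print(down_and_out_call_quadratic(25.0, 15.0, 20.0, 1.0, 0.2, 5.0, 100.0))
```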

An up-and-out call option, expiring worthless if the upper barrier F = H is crossed before
a time to maturity $\tau$ with current (forward) price $F_0 < H$ and $\underline{F} < K < H < \bar{F}$, has value
(ignoring discounting)

$$C^{UO}(H, F_0, K, \tau) = \int_K^H U(H, F, F_0; \tau)\,(F - K)\,dF
= \bar{\Phi}(K) - K\,\Phi(K) - [\bar{\Phi}(H) - K\,\Phi(H)] \tag{3.148}$$


and value zero for $K \ge H$. The second expression obtains by writing the integral as the difference
of two integrals [one from K to $\bar{F}$ and the other from H to $\bar{F}$ within equation (3.138)].
Using equations (3.140), (3.142), (3.144), and (3.145) (or taking the difference between the
two down-and-out call prices) gives the up-and-out call price (excluding discounting)

$$C^{UO}(H, F_0, K, \tau) = (\bar{F} - \underline{F})^{-1}\Big\{(\bar{F} - K)(F_0 - \underline{F})\left[N(d_+(X)) - N(d_+(A))\right]
+ (\underline{F} - K)(\bar{F} - F_0)\left[N(d_-(X)) - N(d_-(A))\right]
+ (K - \bar{F})(\bar{F} - F_0)\left(\frac{H - \underline{F}}{\bar{F} - H}\right)\left[N(d_+(Y)) - N(d_+(B))\right]
+ (K - \underline{F})(F_0 - \underline{F})\left(\frac{\bar{F} - H}{H - \underline{F}}\right)\left[N(d_-(Y)) - N(d_-(B))\right]\Big\}. \tag{3.149}$$


The limit $\bar{F} \to \infty$ of this expression reduces to the exact formula for the up-and-out price for
the affine linear volatility model; and for $\underline{F} = 0$, equation (3.61) is recovered for r = 0 and
$S_0 = F_0$. The price of an up-and-out put can be derived in similar fashion (see Problem 6).

We now present the valuation of European-style double-knockout-barrier options with
underlying asset price $F_t$ satisfying the driftless process, with quadratic volatility function
as in equation (3.117). Although the mapping and the functional relationship between the
kernels in F-space and x-space differ, the procedure is similar to the one employed for the
linear volatility. In particular, we impose zero boundary conditions at both barrier endpoints
L (lower barrier) and H (upper barrier) of the double-knockout-barrier pricing kernel, which
we denote by $U^{DB}$, where $\underline{F} < L < H < \bar{F}$. The values L, H are mapped onto the x-space
endpoints via equation (3.119):

$$x_H = \frac{\sqrt{2}}{\sigma}\log\frac{H - \underline{F}}{\bar{F} - H}, \qquad x_L = \frac{\sqrt{2}}{\sigma}\log\frac{L - \underline{F}}{\bar{F} - L}. \tag{3.150}$$


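From equation (3.121), an exact series for the F-space density follows by simply substituting
the x-space transition density satisfying zero-boundary conditions; i.e., we substitute
equation (3.101) into equation (3.121) while using equation (3.119), giving

$$U^{DB}(F, F_0; \tau) = \frac{2(\bar{F} - \underline{F})}{\log\frac{(H - \underline{F})(\bar{F} - L)}{(L - \underline{F})(\bar{F} - H)}}\sqrt{\frac{(F_0 - \underline{F})(\bar{F} - F_0)}{(F - \underline{F})^3(\bar{F} - F)^3}}
\sum_{n=1}^{\infty} e^{-\bar{\rho}_n\tau}\sin(n\pi y(F_0))\sin(n\pi y(F)), \tag{3.151}$$

where $F, F_0 \in [L, H]$ and

$$y(F) = \frac{X(F) - X(L)}{X(H) - X(L)} = \frac{\log\frac{(F - \underline{F})(\bar{F} - L)}{(L - \underline{F})(\bar{F} - F)}}{\log\frac{(H - \underline{F})(\bar{F} - L)}{(L - \underline{F})(\bar{F} - H)}}, \tag{3.152}$$

$$\bar{\rho}_n = \frac{\sigma^2}{8} + \frac{n^2\pi^2\sigma^2}{2\log^2\frac{(H - \underline{F})(\bar{F} - L)}{(L - \underline{F})(\bar{F} - H)}}. \tag{3.153}$$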


This series possesses the same rapid convergence properties as equation (3.102) for positive
$\tau$ and also represents the Dirac delta function $\delta(F - F_0)$ for the finite domain [L, H] when
$\tau = 0$. A double knockout option maturing in time $\tau$ with payoff $\Lambda(F)$ is priced using
equation (3.105), with $U^{DB}$ now given by equation (3.151). In particular, the value of a double
knockout call is given by (excluding an overall discount factor)

$$C^{DB}(F_0, K, \tau) = \int_L^H U^{DB}(F, F_0; \tau)\,(F - K)_+\,dF
= \begin{cases} \bar{\Phi}(K) - K\,\Phi(K), & K \ge L \\ \bar{\Phi}(L) - K\,\Phi(L), & K < L, \end{cases} \tag{3.154}$$

where $\Phi(\cdot)$ and $\bar{\Phi}(\cdot)$ are defined by

$$\Phi(B) = \int_B^H U^{DB}(F, F_0; \tau)\,dF, \qquad \bar{\Phi}(B) = \int_B^H U^{DB}(F, F_0; \tau)\,F\,dF, \tag{3.155}$$

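for any real value $B \in [L, H]$. As in the single-barrier case, these integrals are most readily
evaluated by changing variables using equation (3.120). From equation (3.151),

$$U^{DB}(F(x), F_0; \tau)\,\frac{dF}{dx} = \frac{2\sqrt{2}\,\sigma}{\bar{F} - \underline{F}}\;\frac{[(F_0 - \underline{F})(\bar{F} - F_0)]^{1/2}}{\log\frac{(H - \underline{F})(\bar{F} - L)}{(L - \underline{F})(\bar{F} - H)}}\cosh\left(\frac{\sigma x}{2\sqrt{2}}\right)
\sum_{n=1}^{\infty} e^{-\bar{\rho}_n\tau}\sin(n\pi y(F_0))\sin\frac{n\pi(x - x_L)}{x_H - x_L}, \tag{3.156}$$

and the integrals in equation (3.155) give

$$\Phi(B) = \frac{2\sqrt{2}\,\sigma}{\bar{F} - \underline{F}}\;\frac{[(F_0 - \underline{F})(\bar{F} - F_0)]^{1/2}}{\log\frac{(H - \underline{F})(\bar{F} - L)}{(L - \underline{F})(\bar{F} - H)}}\sum_{n=1}^{\infty} e^{-\bar{\rho}_n\tau}\sin(n\pi y(F_0))\,I_n(B), \tag{3.157}$$

$$\bar{\Phi}(B) = \frac{2\sqrt{2}\,\sigma}{\bar{F} - \underline{F}}\;\frac{[(F_0 - \underline{F})(\bar{F} - F_0)]^{1/2}}{\log\frac{(H - \underline{F})(\bar{F} - L)}{(L - \underline{F})(\bar{F} - H)}}\sum_{n=1}^{\infty} e^{-\bar{\rho}_n\tau}\sin(n\pi y(F_0))\,\bar{I}_n(B), \tag{3.158}$$

where

$$I_n(B) = \int_{x_B}^{x_H}\cosh\left(\frac{\sigma x}{2\sqrt{2}}\right)\sin\frac{n\pi(x - x_L)}{x_H - x_L}\,dx, \tag{3.159}$$

$$\bar{I}_n(B) = \int_{x_B}^{x_H} F(x)\cosh\left(\frac{\sigma x}{2\sqrt{2}}\right)\sin\frac{n\pi(x - x_L)}{x_H - x_L}\,dx, \tag{3.160}$$

$x_B = X(B) = \frac{\sqrt{2}}{\sigma}\log\frac{B - \underline{F}}{\bar{F} - B}$. These integrals are readily evaluated in exact closed form (see
Problem 7):

$$I_n(B) = \frac{\sigma}{2\sqrt{2}\,\bar{\rho}_n}\left\{\frac{n\pi}{\log\frac{(H - \underline{F})(\bar{F} - L)}{(L - \underline{F})(\bar{F} - H)}}\left[\cos(n\pi y(B))\left(\sqrt{\frac{B - \underline{F}}{\bar{F} - B}} + \sqrt{\frac{\bar{F} - B}{B - \underline{F}}}\right)
- (-1)^n\left(\sqrt{\frac{H - \underline{F}}{\bar{F} - H}} + \sqrt{\frac{\bar{F} - H}{H - \underline{F}}}\right)\right]
- \frac{1}{2}\sin(n\pi y(B))\left(\sqrt{\frac{B - \underline{F}}{\bar{F} - B}} - \sqrt{\frac{\bar{F} - B}{B - \underline{F}}}\right)\right\}, \tag{3.161}$$

$$\bar{I}_n(B) = \frac{\sigma}{2\sqrt{2}\,\bar{\rho}_n}\left\{\frac{n\pi}{\log\frac{(H - \underline{F})(\bar{F} - L)}{(L - \underline{F})(\bar{F} - H)}}\left[\cos(n\pi y(B))\left(\bar{F}\sqrt{\frac{B - \underline{F}}{\bar{F} - B}} + \underline{F}\sqrt{\frac{\bar{F} - B}{B - \underline{F}}}\right)
- (-1)^n\left(\bar{F}\sqrt{\frac{H - \underline{F}}{\bar{F} - H}} + \underline{F}\sqrt{\frac{\bar{F} - H}{H - \underline{F}}}\right)\right]
- \frac{1}{2}\sin(n\pi y(B))\left(\bar{F}\sqrt{\frac{B - \underline{F}}{\bar{F} - B}} - \underline{F}\sqrt{\frac{\bar{F} - B}{B - \underline{F}}}\right)\right\}. \tag{3.162}$$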

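Finally, from equation (3.154) we see that using these expressions within equations (3.157)
and (3.158) for B = K and then separately for B = L and simplifying gives:

$$C^{DB}(F_0, K, \tau) = \frac{\sigma^2[(F_0 - \underline{F})(\bar{F} - F_0)]^{1/2}}{\log\frac{(H - \underline{F})(\bar{F} - L)}{(L - \underline{F})(\bar{F} - H)}}\sum_{n=1}^{\infty}\frac{e^{-\bar{\rho}_n\tau}}{\bar{\rho}_n}\sin(n\pi y(F_0))
\left[\frac{n\pi(-1)^n}{\log\frac{(H - \underline{F})(\bar{F} - L)}{(L - \underline{F})(\bar{F} - H)}}\;\frac{K - H}{\sqrt{(\bar{F} - H)(H - \underline{F})}} - \frac{\sqrt{(\bar{F} - K)(K - \underline{F})}}{\bar{F} - \underline{F}}\sin(n\pi y(K))\right] \tag{3.163}$$

for $K \ge L$ and

$$C^{DB}(F_0, K, \tau) = \frac{\sigma^2[(F_0 - \underline{F})(\bar{F} - F_0)]^{1/2}}{\log^2\frac{(H - \underline{F})(\bar{F} - L)}{(L - \underline{F})(\bar{F} - H)}}\sum_{n=1}^{\infty}\frac{n\pi\,e^{-\bar{\rho}_n\tau}}{\bar{\rho}_n}\sin(n\pi y(F_0))
\left[(-1)^n\frac{K - H}{\sqrt{(\bar{F} - H)(H - \underline{F})}} - \frac{K - L}{\sqrt{(\bar{F} - L)(L - \underline{F})}}\right] \tag{3.164}$$

for K < L. This last expression has a simpler form since y(L) = 0. Note that the two formulas
are identical when K = L. An example of the rapid convergence of these series solutions
is given in Figure 3.7. The limit $\bar{F} \to \infty$ of these expressions gives exact formulas for the
case of an affine linear volatility model; further, by setting $\underline{F} = 0$, we also exactly recover
equations (3.106) and (3.107), respectively. Figure 3.8 demonstrates this explicitly. Indeed
for a given double-barrier call option contract, one observes uniform agreement of the option
prices for the quadratic model with those of the linear model, as the quadratic volatility
function is made to coincide more and more closely with that of the corresponding linear
volatility function.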
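A numerical sketch of the double knockout call series for the quadratic model is given below in Python. It is written from the integrals (3.157)-(3.160) evaluated in closed form and then combined as in equation (3.154), so it should be read as an illustration under that derivation rather than a transcription of the printed formulas; the function name and the default number of terms are assumptions. When both barriers are far from the spot, the value should be close to the barrier-free call price (3.134), and pushing the upper root to very large values with zero lower root should approach the linear-model price, as in Figure 3.8.

```python
import math

def dko_call_quadratic(F0, K, tau, sigma, L, H, f_lo, f_hi, n_terms=200):
    """Double knockout call for the two-root quadratic model (undiscounted)."""
    if K >= H:
        return 0.0  # strike at or above the upper barrier
    log_ratio = math.log((H - f_lo) * (f_hi - L) / ((L - f_lo) * (f_hi - H)))

    def y(F):
        # y(F) of equation (3.152)
        return math.log((F - f_lo) * (f_hi - L) / ((L - f_lo) * (f_hi - F))) / log_ratio

    y0 = y(F0)
    root_h = math.sqrt((f_hi - H) * (H - f_lo))
    total = 0.0
    for n in range(1, n_terms + 1):
        # eigenvalue of equation (3.153)
        rho_bar = sigma**2 / 8.0 + (n * math.pi * sigma) ** 2 / (2.0 * log_ratio**2)
        sign = -1.0 if n % 2 else 1.0  # (-1)^n
        upper = n * math.pi * sign / log_ratio * (K - H) / root_h
        if K >= L:
            lower = (math.sqrt((f_hi - K) * (K - f_lo)) / (f_hi - f_lo)
                     * math.sin(n * math.pi * y(K)))
        else:
            lower = n * math.pi / log_ratio * (K - L) / math.sqrt((f_hi - L) * (L - f_lo))
        total += (math.exp(-rho_bar * tau) / rho_bar
                  * math.sin(n * math.pi * y0) * (upper - lower))
    return sigma**2 * math.sqrt((F0 - f_lo) * (f_hi - F0)) / log_ratio * total

if __name__ == "__main__":
    # Contract of Figures 3.7 and 3.8 with zero lower root and increasing upper root
    for f_hi in (60.0, 120.0, 240.0, 480.0, 3200.0):
        print(f_hi, dko_call_quadratic(20.0, 20.0, 0.25, 0.2, 10.0, 50.0, 0.0, f_hi))
```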


FIGURE 3.7 Rapid convergence of the double knockout call price across the full range of spot $F_0$
as one includes only the first 1, 2, 8, 16, and 32 (thick solid line) terms in the series (3.163), where
L = 10, H = 50, K = 20, $\sigma = 0.2$, $\tau = 0.25$.

FIGURE 3.8 Uniform approach of the double knockout call price for the quadratic model [given by
equation (3.163)] to that of the linear model given by equation (3.106), as $\bar{F}$ is pushed to larger values.
The five thinner curves represent the converged price [i.e., using equation (3.163)] for the quadratic
model for the separate cases of $\bar{F}$ = 60, 120, 240, 480, and 3200. The curve for $\bar{F}$ = 3200 is very close
to the thick solid line representing the price given by the linear model for the same parameter choice:
L = 10, H = 50, K = 20, $\sigma = 0.2$, $\tau = 0.25$.


Problems
Problem 1.
(a) Using equation (3.120) show that

$$[(F(x) - \underline{F})(\bar{F} - F(x))]^{-1/2} = 2(\bar{F} - \underline{F})^{-1}\cosh\left(\frac{\sigma x}{2\sqrt{2}}\right), \tag{3.165}$$

and use this relation and the derivative F'(x) to arrive at equation (3.123).

(b) Use the identity $\int_{-\infty}^{\infty} e^{\pm\sigma x/2\sqrt{2}}\,e^{-(x - x_0)^2/4\tau}\,dx = 2\sqrt{\pi\tau}\;e^{\sigma^2\tau/8}\,e^{\pm\sigma x_0/2\sqrt{2}}$ to show that the
barrier-free density satisfies

$$\int_{\underline{F}}^{\bar{F}} U(F, F_0; \tau)\,dF = \int_{-\infty}^{\infty} U(F(x), F_0; \tau)\,\frac{dF}{dx}\,dx = 1. \tag{3.166}$$

Problem 2. Using parts of Problem 1, show that equation (3.123) leads to the martingale
property:

$$E_0[F_\tau] = E[F_\tau \mid F_{\tau = 0} = F_0] = \int_{\underline{F}}^{\bar{F}} U(F, F_0; \tau)\,F\,dF = F_0. \tag{3.167}$$

Problem 3. Derive equation (3.128) by completing the square in the exponent. Note that the
identity $d_+(1/x) = -d_-(x)$ obtained from equation (3.130) is useful in the manipulation of
expressions.








Problem 4. By following a similar procedure as was used to derive equation (3.134), derive
the exact formula for the corresponding put value. Is a put-call parity relation satisfied?

Problem 5. Derive an exact formula for the down-and-out put value.

Problem 6. Obtain an exact formula for the up-and-out put value for K < H and for K > H.

Problem 7. The integrals in equations (3.159) and (3.160) can be evaluated by rewriting
them as a sum of integrals of the form

$$\int_{x_B}^{x_H} e^{\pm\sigma x/2\sqrt{2}}\sin\frac{n\pi(x - x_L)}{x_H - x_L}\,dx.$$

Use the antiderivative $\int e^{ax}\sin bx\,dx = e^{ax}[a\sin bx - b\cos bx]/(a^2 + b^2) + c$, where a, b, c
are any constants, and then recast the variables $x_B$, $x_H$, $x_L$ in the resulting integrations in
terms of the F-space variables B, H, L and arrive at equations (3.161) and (3.162).


In this section we present a standard Green's function framework for finding solutions for
the x-space kernel subject to homogeneous boundary conditions. Throughout this section we
shall assume one-dimensional diffusions, i.e., a diffusion process $x_t$ obeying

$$dx_t = \lambda(x_t)\,dt + \nu(x_t)\,dW_t, \tag{3.168}$$

with $W_t$ as the standard Wiener process. This process is assumed to have a differentiable
drift function $\lambda(x)$ and a twice differentiable diffusion function or volatility function $\nu(x)$.
The goal is to solve for the kernel or density $u(x, x_0; \tau)$, subject to appropriate boundary
conditions.

Since the drift and volatility functions are assumed to have no explicit time dependence,
the kernel $u = u(x, x_0; \tau)$ satisfies the time-homogeneous forward Kolmogorov equation

$$\frac{\partial u}{\partial \tau} = \frac{1}{2}\frac{\partial^2}{\partial x^2}\left(\nu^2(x)\,u\right) - \frac{\partial}{\partial x}\left(\lambda(x)\,u\right) \equiv \mathcal{L}_x u \tag{3.169}$$

and the corresponding backward equation

$$\frac{\partial u}{\partial \tau} = \frac{1}{2}\nu^2(x_0)\frac{\partial^2 u}{\partial x_0^2} + \lambda(x_0)\frac{\partial u}{\partial x_0} \equiv \tilde{\mathcal{L}}_{x_0} u, \tag{3.170}$$

subject to the initial condition $u(x, x_0; 0) = \delta(x - x_0)$. As in Chapter 1, we have defined the
Fokker-Planck differential operator $\mathcal{L}_x$ that acts on the variable x and its formal Lagrange
adjoint $\tilde{\mathcal{L}}_{x_0}$ acting on $x_0$. One technical point to note is that the differential operator $\mathcal{L}$ is
generally not self-adjoint, i.e., $\tilde{\mathcal{L}} \ne \mathcal{L}$, and the solution for the transition density is generally
not symmetric with respect to interchanging x and $x_0$. However, as is seen from the transformations
provided next, the corresponding time-independent Green's function technique for
solving either forward or backward equations can be treated on a common footing.

In developing a solution framework for $u(x, x_0; \tau)$, we consider the corresponding
time-independent Green's function $G(x, x_0; s)$, which is defined via the Laplace transform with
respect to time:

$$G(x, x_0; s) = L[u(x, x_0; \tau)][s] = \int_0^{\infty} e^{-s\tau}\,u(x, x_0; \tau)\,d\tau. \tag{3.171}$$


[Without loss of generality, we shall assume that u is absolutely integrable with respect
to $\tau$ on any interval $0 \le \tau \le T$ and that $G(x, x_0; s)$ exists for some real value of s = a.
Then from the theory of Laplace transforms it can be shown that $G(x, x_0; s)$ is an analytic
function on the complex s-plane for Re s > a. As will be seen, what is important to keep
in mind for the discussion at hand is that the function $G(x, x_0; s)$ is uniquely determined
by satisfying appropriate boundary conditions in x, for Re s > a.] Taking Laplace transforms
with respect to time $\tau$ on both sides of forward equation (3.169) while making use of the
well-known identity for the Laplace transform of the derivative of a function and the initial
delta function condition on u gives a nonhomogeneous ordinary differential equation for the
Green's function $G = G(x, x_0; s)$,

$$\frac{1}{2}\frac{d^2}{dx^2}\left(\nu^2(x)\,G\right) - \frac{d}{dx}\left(\lambda(x)\,G\right) - sG = \mathcal{L}_x G - sG = -\delta(x - x_0). \tag{3.172}$$


Note here that the partial derivatives have been replaced by ordinary derivatives, where one
is holding $x_0$ (and s) fixed in the Green's function. In similar fashion, by taking Laplace
transforms on both sides of backward equation (3.170), one also obtains the adjoint equation
to equation (3.172):

$$\frac{1}{2}\nu^2(x_0)\frac{d^2 G}{dx_0^2} + \lambda(x_0)\frac{dG}{dx_0} - sG = \tilde{\mathcal{L}}_{x_0} G - sG = -\delta(x - x_0). \tag{3.173}$$

Again, the partial derivatives have been replaced by ordinary derivatives, where one is now
holding x (and s) fixed in the Green's function.
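For the zero-drift Wiener process with $\nu = \sqrt{2}$ used earlier in this chapter, equation (3.172) reduces to $G'' - sG = -\delta(x - x_0)$, whose solution decaying at both infinities is the standard result $G = e^{-\sqrt{s}\,|x - x_0|}/(2\sqrt{s})$. The Python sketch below checks the definition (3.171) against this closed form by direct numerical integration. The function names, the truncation point of the time integral, and the Simpson grid are illustrative assumptions.

```python
import math

def wiener_kernel(x, x0, tau):
    """Free kernel of du/dtau = d2u/dx2 (zero drift, nu = sqrt(2))."""
    if tau <= 0.0:
        return 0.0  # limit tau -> 0+ for x != x0
    return math.exp(-(x - x0) ** 2 / (4.0 * tau)) / math.sqrt(4.0 * math.pi * tau)

def laplace_transform_numeric(x, x0, s, t_max=40.0, n=40000):
    """Composite Simpson rule for the integral (3.171), truncated at t_max (n even)."""
    h = t_max / n

    def f(t):
        return math.exp(-s * t) * wiener_kernel(x, x0, t)

    total = f(0.0) + f(t_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0

def green_closed_form(x, x0, s):
    """Decaying solution of G'' - s G = -delta(x - x0) on the real line, s > 0."""
    return math.exp(-math.sqrt(s) * abs(x - x0)) / (2.0 * math.sqrt(s))

if __name__ == "__main__":
    print(laplace_transform_numeric(1.0, 0.0, 2.0), green_closed_form(1.0, 0.0, 2.0))
```

The two printed numbers should agree to within the quadrature error, which illustrates that imposing the boundary conditions on G in the s-domain encodes the same information as imposing them on u.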

Using either of these equations, the objective is now to solve the ordinary differential
equation (i.e., with delta function as the inhomogeneous source term) for the function
$G(x, x_0; s)$, subject to the same homogeneous boundary conditions that are imposed on the
function $u(x, x_0; \tau)$. Hence either one can solve equation (3.172) with imposed boundary
conditions in x, or one solves the corresponding adjoint equation (3.173) with boundary
conditions imposed in $x_0$. Upon unique determination of $G(x, x_0; s)$, one then has the desired
unique solution for the kernel $u(x, x_0; \tau)$ (which satisfies the same desired homogeneous
boundary conditions) via the Laplace inversion

$$u(x, x_0; \tau) = L^{-1}[G(x, x_0; s)][\tau] = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} e^{s\tau}\,G(x, x_0; s)\,ds. \tag{3.174}$$


We shall use $L^{-1}[F(s)][t]$ to denote the inverse Laplace transform of a function F(s) evaluated
at t. This inversion formula, which can generally be used to compute the inverse Laplace
transform, is the Bromwich contour integral or the Mellin integral arising in the theory of
Laplace and other integral transforms. This contour, depicted in Figure 3.9, is the infinite line
Re s = $\gamma$ on the complex s-plane parametrized by $s = \gamma + ir$, with real parameter r running
from $-\infty$ to $\infty$. Here $\gamma$ is any real number such that all singularities of $G(x, x_0; s)$ (now
considered as a complex-valued function of s for any fixed real values x, $x_0$) lie to the left of
the line Re s = $\gamma$ on the complex s-plane. Throughout, i denotes the imaginary unit,
with z = Re z + i Im z, where Re and Im denote the real and imaginary parts, respectively.
Later we also make use of the polar coordinate form of a complex number, $z = re^{i\theta}$, where r = |z|
is the modulus and $\theta = \arg z$ is the argument of z.

FIGURE 3.9 The Bromwich contour extends from $\gamma - i\infty$ to $\gamma + i\infty$.

Once one has obtained $G(x, x_0; s)$ analytically, the integral in equation (3.174) is
itself an exact integral representation for the transition density $u(x, x_0; \tau)$. This is partly the
reason for sometimes also referring to $G(x, x_0; s)$ as the resolvent kernel. As shown shortly,
in the analytical evaluation of the Laplace inverse, it proves useful to extend the Bromwich
contour to form a closed contour enclosing the negative real half of the complex
s-plane. A simple application of the well-known residue theorem of complex analysis then
allows us to evaluate the integral either as an exact series or in terms of exact closed-form
special functions. A rather general procedure for achieving this purpose is to close
the contour in such a manner that the Green’s function is an analytic function of the complex
variable s everywhere on the closed contour. Inside the contour, G may either be analytic
or have a finite number of isolated simple poles. After justifying
the replacement of the Bromwich integral with the closed contour
(or loop) integral, we apply the standard residue theorem to compute the
result. In more general applications the Green’s function may have a branch point (e.g., due
to factors such as $\sqrt{s}$) that gives rise to a branch cut on the complex s-plane. From complex
analysis we know that the residue theorem cannot be used to evaluate a contour integral that
encloses a branch cut. However, there is a standard technique that can be used in such a case,
sometimes referred to as “shrinking the contour onto the branch cut.” This method
is best described by example. Later we give concrete examples of this Laplace
inversion of G, for the simple case of the Wiener process and also for the more complex case
of the Bessel process. As we shall see, for the case of finite double barriers one obtains a
rapidly converging analytical infinite series representation for the transition density u. This is
the eigenfunction expansion solution for the transition density. Such expansions, as discussed
briefly in Section 3.6.1, also follow from the spectral theory of eigenfunction expansions.
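
As a numerical complement to the contour arguments above, the following sketch inverts a Laplace transform by deforming the Bromwich line into a contour that wraps around the negative real axis. It uses the fixed Talbot quadrature rule, a standard numerical technique that is swapped in here for illustration and is not the analytical method of this section. The function names, the test transform $e^{-k\sqrt{s}}/(2\sqrt{s})$, and the parameter values are assumptions; the known inverse of this transform, $e^{-k^2/4\tau}/(2\sqrt{\pi\tau})$, serves as the reference value.

```python
import cmath
import math


def talbot_inverse(F, t, M=32):
    """Approximate the Bromwich integral of F at time t > 0 (fixed Talbot rule).

    The contour is s(theta) = r*theta*(cot(theta) + i), -pi < theta < pi,
    which opens to the left and encloses the negative real axis.
    """
    r = 2.0 * M / (5.0 * t)
    # theta = 0 node: s = r, with trapezoid weight 1/2
    total = 0.5 * (F(complex(r, 0.0)) * math.exp(r * t)).real
    for k in range(1, M):
        theta = k * math.pi / M
        cot = math.cos(theta) / math.sin(theta)
        s = r * theta * complex(cot, 1.0)
        sigma = theta + (theta * cot - 1.0) * cot  # from ds/dtheta = i*r*(1 + i*sigma)
        total += (cmath.exp(t * s) * F(s) * complex(1.0, sigma)).real
    return r / M * total


def wiener_transform(s, k=0.8):
    """Test transform exp(-k*sqrt(s)) / (2*sqrt(s)); k = 0.8 is a hypothetical value."""
    root = cmath.sqrt(s)  # principal branch, cut along the negative real axis
    return cmath.exp(-k * root) / (2.0 * root)


def gaussian_kernel(tau, k=0.8):
    """Known inverse transform: exp(-k^2 / (4 tau)) / (2 sqrt(pi tau))."""
    return math.exp(-k * k / (4.0 * tau)) / (2.0 * math.sqrt(math.pi * tau))


if __name__ == "__main__":
    tau = 0.5
    print(talbot_inverse(wiener_transform, tau), gaussian_kernel(tau))
```

The principal branch of `cmath.sqrt` places the branch cut on the negative real axis, which is the same choice of cut made in the analytical treatment.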
The Green’s function methodology we now present is based on the Sturm-Liouville
theory of linear ordinary differential equations [MF53, But80, Duf01b, Dav02]. However,
we specialize the theory to the diffusion equation relevant to pricing theory. In what follows
it is convenient to make direct use of equation (3.172). That is, we consider solving the
nonhomogeneous equation (3.172); i.e., we now build the Green’s function $G(x, x_0; s)$ by
considering solutions $y(x; s)$ to the corresponding homogeneous equation


$$\mathcal{L}_x\, y(x; s) - s\, y(x; s) = 0, \tag{3.175}$$


subject to appropriate boundary conditions. Note: For shorthand we shall also simply write
$y(x)$ to mean $y(x; s)$, because s is a fixed parameter in the differential equation. In order
to make use of established results from Sturm-Liouville theory, we shall first transform the
original equation (3.175) into one in standard Sturm-Liouville form. This is accomplished
via a transformation to a new function defined by

$$\hat{y}(x) = \nu(x) \exp\left(-\int^{x} \frac{\lambda(u)}{\nu(u)^2}\, du\right) y(x). \tag{3.176}$$

Using this definition we can show by direct differentiation that

$$\mathcal{L}_x\, y(x) = \frac{1}{\nu(x)} \exp\left(\int^{x} \frac{\lambda(u)}{\nu(u)^2}\, du\right) \hat{\mathcal{L}}_x\, \hat{y}(x), \tag{3.177}$$

where the new differential operator $\hat{\mathcal{L}}_x$ is defined by

$$\hat{\mathcal{L}}_x f(x) = \frac{d}{dx}\left(p(x) \frac{df(x)}{dx}\right) - q(x) f(x) \tag{3.178}$$

for any arbitrary twice differentiable function $f(x)$. Here the functions $p(x)$ and $q(x)$ are
given in terms of the drift and volatility functions:

$$p(x) = \tfrac{1}{2} \nu(x)^2, \tag{3.179}$$

$$q(x) = \frac{1}{2}\left[\lambda'(x) + \left(\frac{\lambda(x)}{\nu(x)}\right)^2 - 2\lambda(x)\frac{\nu'(x)}{\nu(x)} - \nu(x)\nu''(x)\right]. \tag{3.180}$$


(Prime is used to denote differentiation.) The operator $\hat{\mathcal{L}}$ is now in standard Sturm-Liouville
form and is hence also self-adjoint. One should note here that the Green’s function methodology
may also be directly applied to the original non-self-adjoint problem. However, transforming
the equations into the standard self-adjoint Sturm-Liouville form and then solving
and transforming back proves very convenient, as the following analysis shows.
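
The coefficient functions can be checked numerically for a given model. The sketch below assumes the forms $p = \tfrac12\nu^2$ and $q = \tfrac12[\lambda' + (\lambda/\nu)^2 - 2\lambda\nu'/\nu - \nu\nu'']$, approximates the derivatives by central differences, and evaluates them for two test models: constant volatility $\sqrt{2}$ with zero drift, and volatility $2\sqrt{x}$ with a constant drift. The function name, the step size, and the drift value 3 are hypothetical choices made for illustration.

```python
import math


def sturm_liouville_coefficients(nu, lam, x, h=1e-4):
    """Return (p, q) at the point x, using central finite differences.

    nu and lam are callables for the volatility and drift functions.
    """
    nu0 = nu(x)
    d_nu = (nu(x + h) - nu(x - h)) / (2.0 * h)
    d2_nu = (nu(x + h) - 2.0 * nu0 + nu(x - h)) / h**2
    d_lam = (lam(x + h) - lam(x - h)) / (2.0 * h)
    p = 0.5 * nu0**2
    q = 0.5 * (d_lam + (lam(x) / nu0) ** 2 - 2.0 * lam(x) * d_nu / nu0 - nu0 * d2_nu)
    return p, q


LAM = 3.0             # hypothetical constant drift
MU = LAM / 2.0 - 1.0  # expected order: q should equal MU**2 / (2 x)

# Square-root volatility with constant drift.
p_sqrt, q_sqrt = sturm_liouville_coefficients(
    lambda x: 2.0 * math.sqrt(x), lambda x: LAM, 1.5)

# Constant volatility sqrt(2) with zero drift: expect p = 1 and q = 0.
p_const, q_const = sturm_liouville_coefficients(
    lambda x: math.sqrt(2.0), lambda x: 0.0, 0.3)
```

For the square-root model the result should agree with $p(x) = 2x$ and $q(x) = \mu^2/2x$, where $\mu = \lambda/2 - 1$, up to finite-difference error.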

From equation (3.176), it follows that a related (new or modified) Green’s function $\hat{G}$ is
similarly defined as

$$\hat{G}(x, x_0; s) = \frac{\nu(x)}{\nu(x_0)} \exp\left(-\int_{x_0}^{x} \frac{\lambda(u)}{\nu(u)^2}\, du\right) G(x, x_0; s), \tag{3.181}$$

leading to the transformed nonhomogeneous equation in Sturm-Liouville form:

$$\hat{\mathcal{L}}_x \hat{G}(x, x_0; s) - s\, \hat{G}(x, x_0; s) = -\delta(x - x_0). \tag{3.182}$$

Note that the inhomogeneous term again contains only the Dirac delta function since the
property $\frac{f(x)}{f(x_0)}\delta(x - x_0) = \delta(x - x_0)$ has been used, where the ratio $f(x)/f(x_0)$ is nonsingular
[i.e., $f(x) = \nu(x)\exp\left(-\int^{x} (\lambda(u)/\nu(u)^2)\, du\right)$] within the allowable solution region and evaluates to unity
when $x = x_0$. Equation (3.182) is now the desired standard form, which may be solved subject
to various homogeneous boundary conditions. Upon solving for $\hat{G}$ we then simply invert
equation (3.181), giving G, as shown next. By using standard textbook methods of solution
for nonhomogeneous second-order ordinary differential equations (e.g., the method of variation
of parameters), $\hat{G}(x, x_0; s)$ is readily obtained from the solutions $\hat{y}(x) = \hat{y}(x; s)$ to the
corresponding homogeneous equation [i.e., the homogeneous counterpart of equation (3.182)]:

$$\hat{\mathcal{L}}_x\, \hat{y}(x) - s\, \hat{y}(x) = 0. \tag{3.183}$$


Generally, if $\hat{y}_1$ and $\hat{y}_2$ are two linearly independent solutions to equation (3.183), then the
Green’s function is readily shown to take the form

$$\hat{G}(x, x_0; s) = -\begin{cases} \dfrac{\hat{y}_1(x)\, \hat{y}_2(x_0)}{pW}, & x \le x_0, \\[2ex] \dfrac{\hat{y}_1(x_0)\, \hat{y}_2(x)}{pW}, & x \ge x_0. \end{cases} \tag{3.184}$$

Here $pW \equiv p(x_0)W(x_0) = p(x)W(x)$ is a constant independent of $x$ and $x_0$ (not constant
w.r.t. s), as can be shown from the properties of the Wronskian of any two solutions to equation
(3.183): $W(x) = W[\hat{y}_1, \hat{y}_2](x) = \hat{y}_1(x)\hat{y}_2'(x) - \hat{y}_1'(x)\hat{y}_2(x)$. The boundary conditions
for the Green’s function are matched by the choice of the two solutions $\hat{y}_1$, $\hat{y}_2$. The reader
should also note that, since equation (3.184) involves the product of two independent
solutions divided by their Wronskian, the Green’s function is unchanged
if we multiply either of the two solutions by an arbitrary nonzero constant. The symmetry
$\hat{G}(x, x_0; s) = \hat{G}(x_0, x; s)$ with respect to interchanging $x$ and $x_0$ is also a useful property,
following from the fact that the Sturm-Liouville operator is self-adjoint. The solution $\hat{y}_1$ is
chosen to match the boundary condition at the lower endpoint, while $\hat{y}_2$ is chosen to match the
boundary condition at the upper endpoint. For example, if one requires zero-boundary conditions at two
points $x = x_L$ and $x = x_H$ ($x_L < x_H$) with $\hat{G}(x = x_L, x_0; s) = 0$ and $\hat{G}(x = x_H, x_0; s) = 0$, then
linear combinations of independent solutions to equation (3.183) must be formed to give
$\hat{y}_1(x_L) = \hat{y}_2(x_H) = 0$. Inserting the two solutions and their Wronskian into equation (3.184)
gives $\hat{G}$.
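
The construction can be sketched in a few lines of code. The example below assumes the form $\hat{G} = -\hat{y}_1(x^<)\hat{y}_2(x^>)/(pW)$, where $x^<$ and $x^>$ denote the smaller and larger of $x$, $x_0$, and tests it on the simplest operator $d^2/dx^2$ (so $p = 1$), with decaying exponentials as the two solutions. A correctly built Green’s function must be symmetric and must have a derivative jump satisfying $p(x_0)[\hat{G}'(x_0^+) - \hat{G}'(x_0^-)] = -1$, which follows from integrating the delta-function equation across $x_0$. All names and the value of $s$ are hypothetical.

```python
import math


def make_green(y1, y2, p_wronskian):
    """Build the Green's function -y1(x<) * y2(x>) / (pW) from two solutions.

    y1 satisfies the lower boundary condition, y2 the upper one, and
    p_wronskian is the constant p(x) * W[y1, y2](x).
    """
    def green(x, x0):
        lo, hi = min(x, x0), max(x, x0)
        return -y1(lo) * y2(hi) / p_wronskian
    return green


def derivative_jump(green, x0, p_at_x0, h=1e-6):
    """Estimate p(x0) * [G'(x0+) - G'(x0-)] with one-sided differences."""
    mid = green(x0, x0)
    right = (green(x0 + h, x0) - mid) / h
    left = (mid - green(x0 - h, x0)) / h
    return p_at_x0 * (right - left)


S = 0.7  # hypothetical value of the Laplace variable (real, positive)
ROOT = math.sqrt(S)

# Solutions of y'' - s y = 0 that vanish at -infinity and +infinity.
green_free = make_green(lambda x: math.exp(ROOT * x),
                        lambda x: math.exp(-ROOT * x),
                        -2.0 * ROOT)  # pW = y1*y2' - y1'*y2 with p = 1
```

Rescaling either solution by a nonzero constant rescales `p_wronskian` by the same factor, so the returned function does not change.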

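From equation (3.181), the Green’s function to the original problem (3.172) is then
obtained as

$$G(x, x_0; s) = -\frac{\nu(x_0)}{\nu(x)} \exp\left(\int_{x_0}^{x} \frac{\lambda(u)}{\nu(u)^2}\, du\right) \begin{cases} \dfrac{\hat{y}_1(x)\, \hat{y}_2(x_0)}{pW}, & x_L \le x \le x_0, \\[2ex] \dfrac{\hat{y}_1(x_0)\, \hat{y}_2(x)}{pW}, & x_0 \le x \le x_H. \end{cases} \tag{3.185}$$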


Here we have assumed that the multiplicative factor to the left of the curly bracket in equation
(3.185) is finite at the solution endpoints. In the special case of a singular multiplicative
factor at an endpoint, we assume that either $\hat{y}_1$ or $\hat{y}_2$ approaches zero more rapidly at the
endpoint so that G satisfies the same zero-boundary condition as $\hat{G}$. The foregoing expression
is applicable to all cases of homogeneous boundary conditions that we shall encounter. [It is
noted that this approach can also handle boundary conditions of a mixed kind to accommodate
other types of solutions, such as reflection at a boundary. However, throughout we are only
concerned with zero (i.e., Dirichlet) boundary conditions for the purpose of pricing barrier
as well as barrier-free European options for state-dependent volatility models to follow.]
The points $x_L$, $x_H$ can be finite, or either point can be taken in the infinite limit, depending
on the allowable solution space. Note that, in contrast to $\hat{G}$, G is generally not symmetric
with respect to interchanging $x$ and $x_0$. However, by direct inspection we see that G in
equation (3.185) is a product of functions in $x$ and $x_0$ and hence automatically provides us
with solutions to the homogeneous equation (3.175) and its adjoint equation, where the functions
$\nu(x_0) \exp\left(-\int^{x_0} \frac{\lambda(u)}{\nu(u)^2}\, du\right)\hat{y}_1(x_0)$ and $\nu(x_0) \exp\left(-\int^{x_0} \frac{\lambda(u)}{\nu(u)^2}\, du\right)\hat{y}_2(x_0)$ form two linearly
independent solutions to the homogeneous version of equation (3.173), i.e., to the equation
obtained from (3.173) by dropping its delta-function term. Renaming variables $x_0 \to x$ and $s \to \rho$ hence
gives a general solution to this homogeneous ordinary differential equation rewritten in terms
of $x$ in equation (3.272), which we obtain simply by inspection of the Green’s function G.
We shall denote this solution by $\hat{u}(x; \rho)$, where

$$\hat{u}(x; \rho) = \nu(x) \exp\left(-\int^{x} \frac{\lambda(u)}{\nu(u)^2}\, du\right) \left[q_1\, \hat{y}_1(x; \rho) + q_2\, \hat{y}_2(x; \rho)\right] \tag{3.186}$$

and $q_1$, $q_2$ are arbitrary constants. The function $\hat{u}(x; \rho)$ (in Section 3.8.1 it is referred to as a
generating function) will turn out to play an important role in generating new pricing kernels
for an F-space process from known x-space kernels, as is discussed later in this chapter.

In closing this section, we demonstrate the Green’s function procedure with a standard 
example covering the different cases of boundary conditions. 


Example 5. The Wiener Process. 


Let’s consider the process $dx_t = \sqrt{2}\, dW_t$, where $\nu(x) = \sqrt{2}$ (constant volatility) and $\lambda(x) = 0$
(zero drift). From equations (3.179) and (3.180), the functions $p(x) = 1$ and $q(x) = 0$ are
trivial. In this special case $\hat{\mathcal{L}}_x = \mathcal{L}_x = d^2/dx^2$ and $\hat{G} = G$, which satisfies

$$\frac{d^2}{dx^2} G - sG = -\delta(x - x_0), \tag{3.187}$$

where $\hat{y} = y$ satisfies the corresponding homogeneous equation [i.e., equation (3.183)
or (3.175)] $y'' - sy = 0$. Two independent solutions of this equation are $e^{\sqrt{s}x}$ and $e^{-\sqrt{s}x}$. If
we seek barrier-free kernel solutions, then we impose zero-boundary conditions at $x \to \pm\infty$
(e.g., $x_L = -\infty$ and $x_H = \infty$ in the earlier notation). Therefore we let $\hat{y}_1 = e^{\sqrt{s}x}$, $\hat{y}_2 = e^{-\sqrt{s}x}$,
since $e^{\pm\sqrt{s}x} \to 0$ as $x \to \mp\infty$ for real values of $s > 0$ (also generally true for $\operatorname{Re} s > 0$).
The Wronskian of these two solutions gives $pW = -2\sqrt{s}$. Using equation (3.185), where the
multiplicative factor is just unity, gives the Green’s function for the barrier-free case:

$$G(x, x_0; s) = \begin{cases} e^{\sqrt{s}(x - x_0)}/2\sqrt{s}, & x \le x_0, \\[1ex] e^{-\sqrt{s}(x - x_0)}/2\sqrt{s}, & x \ge x_0. \end{cases} \tag{3.188}$$

Obtaining the kernel $u(x, x_0; \tau)$ is now just a matter of Laplace-inverting this function from s
back into the time $\tau$ domain. Note that we may rewrite $G = e^{-k\sqrt{s}}/2\sqrt{s}$, $k = x^{>} - x^{<}$ ($k \ge 0$),
where $x^{>}$ ($x^{<}$) stands for the greater (smaller) of the two real numbers $x$, $x_0$. The inverse Laplace
transform can in some cases be found directly with the use of tables. For instance, in this
case one can look up a table of transforms to find $\mathcal{L}^{-1}[e^{-k\sqrt{s}}/\sqrt{s}](\tau) = e^{-k^2/4\tau}/\sqrt{\pi\tau}$. Since
$k^2 = (x - x_0)^2$ (regardless of the relative magnitudes of $x$ and $x_0$), we then recover the known
solution $u(x, x_0; \tau) = g_0(x, x_0; \tau)$ exactly as in equation (3.9), and the problem of obtaining
the barrier-free kernel has been completely solved.
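
The table entry can be confirmed in the forward direction by numerical quadrature. The sketch below integrates $e^{-s\tau}$ against the Gaussian kernel $e^{-(x-x_0)^2/4\tau}/\sqrt{4\pi\tau}$ over $\tau \in (0, \infty)$ and compares the result with $e^{-k\sqrt{s}}/2\sqrt{s}$, $k = |x - x_0|$. The substitution $\tau = y^2$, the Simpson rule, the truncation point, and all names and sample values are choices made here for illustration only.

```python
import math


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def laplace_of_gaussian_kernel(x, x0, s):
    """Laplace transform in tau of the Gaussian kernel, for real s > 0.

    With tau = y**2 the integrand becomes
    exp(-s y^2 - (x - x0)^2 / (4 y^2)) / sqrt(pi), which is smooth on [0, inf).
    """
    k2 = (x - x0) ** 2

    def integrand(y):
        if y == 0.0:
            # Limit y -> 0: zero when x != x0, otherwise 1/sqrt(pi).
            return 0.0 if k2 > 0.0 else 1.0 / math.sqrt(math.pi)
        return math.exp(-s * y * y - k2 / (4.0 * y * y)) / math.sqrt(math.pi)

    # exp(-s y^2) is negligible beyond y = 12 / sqrt(s).
    return simpson(integrand, 0.0, 12.0 / math.sqrt(s))


def green_barrier_free(x, x0, s):
    """Closed form exp(-|x - x0| sqrt(s)) / (2 sqrt(s))."""
    return math.exp(-abs(x - x0) * math.sqrt(s)) / (2.0 * math.sqrt(s))
```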

It is now instructive to show how the inverse transform is computed using standard
techniques of complex analysis, without the use of tables; this is particularly important
for handling nonelementary Green’s functions, as discussed later. G is an analytic function on
the complex s-plane, except for a branch point at $s = 0$ due to the $\sqrt{s}$ factor. For this reason
a branch cut must be introduced. We consider the closed
contour in Figure 3.10 (see Section 3.7.1) with branch cut chosen as the negative real line
$\arg s = \pi$, with complex s-plane $|\arg s| < \pi$, i.e., the principal branch. The Bromwich contour
corresponds to the line segment MN. Since G is analytic everywhere (i.e., no singularities)
on and inside the entire region within the closed contour for all values of the semicircular
radius $R > 0$ as well as for positive parameters $\rho$, $\delta$, and $\gamma$ taken arbitrarily close to zero,
Cauchy’s integral theorem gives a value of zero for the complete loop integral. Hence the
Bromwich integral is equal to the negative of the sum of all the other contour integrals that
make up the closed loop. From this fact, the kernel is

$$u(x, x_0; \tau) = \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} e^{s\tau} G(x, x_0; s)\, ds = -\frac{1}{2\pi i} \left[\int_{C_R^{+}} + \int_{C_R^{-}} + \int_{C_\rho} + \int_{P}^{Q} + \int_{Q'}^{P'}\right] e^{s\tau}\, \frac{e^{-k\sqrt{s}}}{2\sqrt{s}}\, ds. \tag{3.189}$$

This sum of integrals is dramatically reduced using standard arguments as follows. Taking
limits $R \to \infty$ and $\gamma, \rho, \delta \to 0$, the $C_R^{\pm}$ integrals vanish, since along the semicircular
contours $s = Re^{i\theta}$, $\frac{\pi}{2} < |\theta| < \pi$, hence $\cos\theta < 0$, so the modulus of the integrand $|Ge^{s\tau}| =
e^{R\tau\cos\theta}\, e^{-k\sqrt{R}\cos(\theta/2)}/2\sqrt{R} \to 0$ as $R \to \infty$. The $C_\rho$ integral for the circular segment $QQ'$ also
vanishes since $s = \rho e^{i\theta}$, $-\pi < \theta < \pi$, so the modulus (as $\rho \to 0$) of this integral has value
$\le \sqrt{\rho} \times \text{const.}$, which goes to zero in the limit $\rho \to 0$. The only nonzero integrals are along
the branch cut corresponding to the $PQ$ and $Q'P'$ segments, where $s = re^{i\pi}$ ($\sqrt{s} = i\sqrt{r}$) and
$s = re^{-i\pi}$ ($\sqrt{s} = -i\sqrt{r}$), respectively, in the limit $\delta \to 0$, with $\rho < r < R$. In the limits $\rho \to 0$
and $R \to \infty$ the two integrals are combined to give the real-valued integral

$$u(x, x_0; \tau) = \frac{1}{2\pi} \int_0^{\infty} e^{-r\tau}\, \frac{\cos(k\sqrt{r})}{\sqrt{r}}\, dr = \frac{1}{2\sqrt{\pi\tau}}\, e^{-k^2/4\tau}, \tag{3.190}$$

where the last result is $g_0(x, x_0; \tau)$, as before, and was obtained by a change of integration
variables resulting in the cosine transform of a Gaussian function giving a Gaussian in $k$.
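
The branch-cut integral can be evaluated numerically as a check. The sketch below assumes the representation $u = \frac{1}{2\pi}\int_0^\infty e^{-r\tau}\cos(k\sqrt{r})/\sqrt{r}\, dr$; the change of variable $r = y^2$ turns it into $\frac{1}{\pi}\int_0^\infty e^{-\tau y^2}\cos(ky)\, dy$, a cosine transform of a Gaussian, which is then compared with $e^{-k^2/4\tau}/(2\sqrt{\pi\tau})$. The quadrature rule, the truncation point, and the sample values are assumptions.

```python
import math


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def kernel_from_branch_cut(k, tau):
    """Evaluate (1/pi) * int_0^inf exp(-tau y^2) cos(k y) dy numerically."""
    def integrand(y):
        return math.exp(-tau * y * y) * math.cos(k * y)
    # exp(-tau y^2) is negligible beyond y = 12 / sqrt(tau).
    return simpson(integrand, 0.0, 12.0 / math.sqrt(tau)) / math.pi


def gaussian_kernel(k, tau):
    """Closed form exp(-k^2 / (4 tau)) / (2 sqrt(pi tau))."""
    return math.exp(-k * k / (4.0 * tau)) / (2.0 * math.sqrt(math.pi * tau))
```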

Barrier kernels for the Wiener process are also readily obtained. The Green’s functions
provide solutions that relate directly to the method of images, discussed partly in Section 3.2.1.
In particular, let’s reconsider the problem of finding the kernel in the domain $x, x_0 \in (-\infty, x_H]$
for a single upper barrier at level $x_H$. [The steps for the case of a lower barrier are the
same.] Since we wish to impose zero-boundary conditions for the kernel at $x_L = -\infty$ and
at $x_H$, we form a linear combination of $e^{\sqrt{s}x}$ and $e^{-\sqrt{s}x}$ to set $\hat{y}_2(x) = \sinh\sqrt{s}(x - x_H)$
and set $\hat{y}_1(x) = e^{\sqrt{s}x}$. Hence $\hat{y}_2(x_H) = 0$, $\hat{y}_1(-\infty) = 0$ for $\operatorname{Re} s > 0$, as needed. In this case
$pW = \sqrt{s}\, e^{\sqrt{s}x_H}$ and the Green’s function is

$$G(x, x_0; s) = \frac{1}{2\sqrt{s}} \left[e^{-\sqrt{s}(x^{>} - x^{<})} - e^{-\sqrt{s}(2x_H - x - x_0)}\right] \tag{3.191}$$

since $x^{>} + x^{<} = x + x_0$. This involves the difference of two expressions of the same functional
form in s as in the barrier-free Green’s function. Hence, Laplace-inverting gives precisely the
kernel $u(x, x_0; \tau) = g^{u}(x_H, x, x_0; \tau)$ of equation (3.14). This result was previously derived by
the method of images.
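
The algebra behind the image form is easy to confirm numerically. The sketch below builds the single-barrier Green’s function in two ways for real $s > 0$: directly from the two solutions $e^{\sqrt{s}x}$ and $\sinh\sqrt{s}(x - x_H)$ with $pW = \sqrt{s}\,e^{\sqrt{s}x_H}$, and as the difference of a barrier-free term and its image reflected about $x_H$. The function names and sample values are hypothetical.

```python
import math


def green_from_solutions(x, x0, s, x_h):
    """-y1(x<) * y2(x>) / (pW) with y1 = exp(sqrt(s) x), y2 = sinh(sqrt(s)(x - x_h))."""
    root = math.sqrt(s)
    lo, hi = min(x, x0), max(x, x0)
    y1 = math.exp(root * lo)
    y2 = math.sinh(root * (hi - x_h))
    p_wronskian = root * math.exp(root * x_h)
    return -y1 * y2 / p_wronskian


def green_image_form(x, x0, s, x_h):
    """Barrier-free term minus its image reflected about the barrier x_h."""
    root = math.sqrt(s)
    direct = math.exp(-root * abs(x - x0))
    image = math.exp(-root * (2.0 * x_h - x - x0))
    return (direct - image) / (2.0 * root)
```

Both forms vanish at $x = x_H$, as required by the absorbing boundary condition.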

The last case of interest is the kernel having zero value at two finite endpoints $x_L$, $x_H$ (i.e.,
the double barrier) with solution domain $x, x_0 \in [x_L, x_H]$. Following similar steps as before
we have

$$G(x, x_0; s) = -\frac{\sinh\sqrt{s}(x^{<} - x_L)\, \sinh\sqrt{s}(x^{>} - x_H)}{\sqrt{s}\, \sinh\sqrt{s}(x_H - x_L)}. \tag{3.192}$$

Note that this function is zero at both endpoints. This Green’s function leads to two separate
types of exact series expansions for the kernel. The first type is an eigenfunction expansion.
The relation between eigenfunction expansions for diffusion kernels and Green’s functions
is discussed in the next section. Here we show explicitly how such an expansion arises
from the Laplace inversion of equation (3.192). Observe that G is a ratio of two analytic
functions of complex s, despite the appearance of the $\sqrt{s}$ factor. Indeed this can be seen
by a direct Taylor expansion of the hyperbolic sine in both numerator and denominator.
The only singularities of G are isolated simple poles along the negative real axis. In fact,
using the identity $\sinh(ix) = i\sin(x)$ and letting $s = -|\epsilon|$, the denominator of G along the
negative real axis is $\sqrt{s}\sinh\sqrt{s}(x_H - x_L) = -\sqrt{|\epsilon|}\sin\sqrt{|\epsilon|}(x_H - x_L)$. Therefore the zeros
of the sine function give the simple poles of G at positions $s = \epsilon_n = -n^2\pi^2/(x_H - x_L)^2$,
$n = 1, 2, \ldots$. Note that $s = 0$ is a removable singularity in this case, as is shown by a Taylor
expansion of the denominator about $s = 0$. The Bromwich integral can therefore be closed
by joining a single semicircular contour $C_R$ enclosing the negative real half of the complex
s-plane, as long as the contour does not coincide with any of the isolated poles. Since the
modulus of the integrand in the $C_R$ integral approaches zero as $R \to \infty$, the residue theorem
gives

$$u(x, x_0; \tau) = \sum_{n=1}^{\infty} e^{\epsilon_n \tau}\, \operatorname{Res} G(x, x_0; s = \epsilon_n). \tag{3.193}$$


Since the Green’s function is a ratio of two analytic functions, e.g., $G(s) = P(s)/Q(s)$, where
$Q(s)$ has simple zeros at $s = \epsilon_n$, then from complex analysis we know that the residue
at each pole is given by $\operatorname{Res} G(s = \epsilon_n) = P(\epsilon_n)/Q'(\epsilon_n)$. Evaluating the derivative of the
denominator in equation (3.192) and the numerator at each pole while making use of the
identity $\sinh(ix) = i\sin(x)$ and recasting one of the resulting sine functions in the numerator,
we obtain

$$\operatorname{Res} G(x, x_0; s = \epsilon_n) = \frac{2}{x_H - x_L}\, \sin\frac{n\pi(x - x_L)}{x_H - x_L}\, \sin\frac{n\pi(x_0 - x_L)}{x_H - x_L}, \tag{3.194}$$

which is valid regardless of the relative magnitude of $x$ and $x_0$. Substituting into equation
(3.193) therefore recovers the kernel (3.101). Recall that this kernel was used in
Section 3.5 to generate rapidly convergent exact series solutions for the affine and quadratic
(with two distinct roots) volatility models.

Green’s function (3.192) can also be used to generate a second type of exact infinite 
expansion for the kernel, which is not based on eigenfunctions but rather gives exactly what 
one would obtain by applying the method of infinite images. The idea is to reexpress G 
in a Taylor expansion involving an infinite sum of exponential terms, which upon Laplace 
inversion gives rise to an infinite sum of kernels of the barrier-free type that will be centered 
at the image points located at a sequence of increasing distances from either side of the 
solution domain. We leave this as an exercise for the interested reader (see Problem 1). 
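
Without giving away the exercise, the agreement of the two expansions can be observed numerically. The sketch below assumes the eigenfunction series $\frac{2}{L}\sum_n e^{-n^2\pi^2\tau/L^2}\sin\frac{n\pi(x - x_L)}{L}\sin\frac{n\pi(x_0 - x_L)}{L}$, with $L = x_H - x_L$, and an image series built from the barrier-free Gaussian kernel $e^{-(x-x_0)^2/4\tau}/\sqrt{4\pi\tau}$ with sources at $x_0 + 2nL$ and negative sources at $2x_L - x_0 + 2nL$. The truncation levels and sample values are hypothetical.

```python
import math


def g0(x, x0, tau):
    """Barrier-free Gaussian kernel for the process dx = sqrt(2) dW."""
    return math.exp(-(x - x0) ** 2 / (4.0 * tau)) / math.sqrt(4.0 * math.pi * tau)


def kernel_eigen(x, x0, tau, x_l, x_h, terms=200):
    """Truncated eigenfunction (sine) series for the double-barrier kernel."""
    length = x_h - x_l
    total = 0.0
    for n in range(1, terms + 1):
        k = n * math.pi / length
        total += math.exp(-k * k * tau) * math.sin(k * (x - x_l)) * math.sin(k * (x0 - x_l))
    return 2.0 / length * total


def kernel_images(x, x0, tau, x_l, x_h, images=20):
    """Truncated image series: positive sources at x0 + 2nL, negative at 2 x_l - x0 + 2nL."""
    length = x_h - x_l
    total = 0.0
    for n in range(-images, images + 1):
        shift = 2.0 * n * length
        total += g0(x, x0 + shift, tau) - g0(x, 2.0 * x_l - x0 + shift, tau)
    return total
```

The eigenfunction series converges fastest for large $\tau$, while the image series needs only a few terms for small $\tau$, so the two forms complement each other in practice.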


Problems


Problem 1. By using the expansion identity $1/\sinh x = 2\sum_{n=0}^{\infty} e^{-(2n+1)x}$, show that
the Green’s function in equation (3.192) is given by

$$G(x, x_0; s) = \frac{1}{2\sqrt{s}} \sum_{n=0}^{\infty} e^{-2n\sqrt{s}(x_H - x_L)} \left[e^{-\sqrt{s}(x^{>} - x^{<})} + e^{-\sqrt{s}\,(2(x_H - x_L) - (x^{>} - x^{<}))} - e^{-\sqrt{s}(x^{>} + x^{<} - 2x_L)} - e^{-\sqrt{s}(2x_H - x^{>} - x^{<})}\right]. \tag{3.195}$$

By taking the Laplace inverse of this series, obtain an infinite series for the kernel.


Problem 2. Verify that

$$u(x, x_0; \tau) = \sum_{n=-\infty}^{\infty} \left[g_0(x, x_0 + 2nL; \tau) - g_0(x, 2nL - x_0; \tau)\right], \tag{3.196}$$

where $g_0$ is defined by equation (3.9), $\tau = t - t_0$, is a solution to equations (3.7) and (3.8) in
the finite domain $0 < x, x_0 < L$. Determine the boundary conditions at the endpoints.


3.6.1 Eigenfunction Expansions for the Green’s Function and the Transition Density


Green’s functions are intimately tied to the eigenvalue-eigenfunction problem of the
corresponding homogeneous equation. Here it suffices to give only the most basic and brief
discussion of this useful aspect of the theory. In particular, as an alternative to the closed-form
expressions of the previous section, it is sometimes useful to consider Green’s function
solutions directly in terms of eigenfunction expansions when possible. Let us again consider
equation (3.183). This equation, together with the imposed boundary conditions, constitutes an
eigenvalue problem of the Sturm-Liouville type. For the case of zero (homogeneous) boundary
conditions at two finite boundaries it follows from regular Sturm-Liouville theory that if the
functions $p(x)$ and $q(x)$ in equations (3.179) and (3.180) are well behaved (i.e., $p(x) > 0$ and
$p$, $p'$, $q$ are continuous in a finite solution domain $[x_L, x_H]$), then the Green’s function admits
a spectral resolution of the form

$$\hat{G}(x, x_0; s) = \sum_{n=1}^{\infty} \frac{\phi_n(x)\, \phi_n(x_0)}{s - \epsilon_n}, \tag{3.197}$$

where the eigenfunctions $\phi_n(x)$ satisfy the eigenvalue equation

$$\hat{\mathcal{L}}_x\, \phi_n(x) = \epsilon_n\, \phi_n(x) \tag{3.198}$$

with eigenvalue $\epsilon_n$ and boundary conditions $\phi_n(x_L) = \phi_n(x_H) = 0$. The expression in equation
(3.197) is readily verified to satisfy equation (3.182) by applying the operator, term by term, in
the sum and using upcoming equation (3.200). Also from Sturm-Liouville theory we have
that the eigenvalue spectrum $\epsilon_n = -|\epsilon_n|$, $n = 1, 2, \ldots$, for a regular problem is real and
discrete (countably infinite), where the $|\epsilon_n|$ form an increasing sequence. The corresponding
eigenfunctions $\phi_n(x)$ form a complete orthonormal basis set with

$$(\phi_n, \phi_m) \equiv \int_{x_L}^{x_H} \phi_n(x)\, \phi_m(x)\, dx = \delta_{nm}, \tag{3.199}$$

where $\delta_{nm} = 1$ for $m = n$ and is otherwise zero. Note that completeness of the functions
also gives

$$\sum_{n=1}^{\infty} \phi_n(x)\, \phi_n(x_0) = \delta(x - x_0), \tag{3.200}$$

so any smooth function $f(x)$ admits an eigenfunction expansion

$$f(x) = \sum_{n=1}^{\infty} a_n\, \phi_n(x) \tag{3.201}$$

with coefficients $a_n = (f, \phi_n)$. Assuming we have determined the eigenfunctions $\phi_n(x)$, the
original Green’s function $G(x, x_0; s)$ is then given by equations (3.197) and (3.181). Substituting
this form into equation (3.174) and taking the inverse Laplace transform operation inside
the summation gives a formal eigenfunction series solution representation for the kernel:

$$u(x, x_0; \tau) = \frac{\nu(x_0)}{\nu(x)}\, e^{\int_{x_0}^{x} \frac{\lambda(u)}{\nu(u)^2} du} \sum_{n=1}^{\infty} \mathcal{L}^{-1}\left[\frac{1}{s - \epsilon_n}\right](\tau)\, \phi_n(x)\, \phi_n(x_0) = \frac{\nu(x_0)}{\nu(x)}\, e^{\int_{x_0}^{x} \frac{\lambda(u)}{\nu(u)^2} du} \sum_{n=1}^{\infty} e^{\epsilon_n \tau}\, \phi_n(x)\, \phi_n(x_0). \tag{3.202}$$

Note that in the last step the Laplace transform is trivially known and one does not really 
need to resort to the residue theorem to compute the Laplace inverse transform. This result 
also follows, though, from a straightforward application of the residue theorem by closing 
the Bromwich contour with an infinite semicircular portion to the left and thereby picking 
up the contributions from the residues occurring at the simple poles of G that lie along the 
negative real axis. 
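
The spectral resolution can be compared with a closed-form Green’s function in the simplest regular case. The sketch below takes the operator $d^2/dx^2$ on $[x_L, x_H]$ with zero boundary conditions, for which the normalized eigenfunctions are $\sqrt{2/L}\,\sin(n\pi(x - x_L)/L)$ with eigenvalues $-n^2\pi^2/L^2$ and $L = x_H - x_L$, and checks the truncated sum $\sum_n \phi_n(x)\phi_n(x_0)/(s - \epsilon_n)$ against $-\sinh\sqrt{s}(x^< - x_L)\sinh\sqrt{s}(x^> - x_H)/(\sqrt{s}\sinh\sqrt{s}L)$ for real $s > 0$. The names, the truncation level, and the sample values are assumptions.

```python
import math


def green_closed(x, x0, s, x_l, x_h):
    """Closed-form double-barrier Green's function for d^2/dx^2, real s > 0."""
    root = math.sqrt(s)
    lo, hi = min(x, x0), max(x, x0)
    numerator = math.sinh(root * (lo - x_l)) * math.sinh(root * (hi - x_h))
    return -numerator / (root * math.sinh(root * (x_h - x_l)))


def green_spectral(x, x0, s, x_l, x_h, terms=200000):
    """Truncated spectral sum of phi_n(x) phi_n(x0) / (s - eps_n).

    Here eps_n = -(n pi / L)^2, so the denominator is s + (n pi / L)^2.
    The terms decay like 1/n^2, hence the large default truncation level.
    """
    length = x_h - x_l
    total = 0.0
    for n in range(1, terms + 1):
        k = n * math.pi / length
        total += math.sin(k * (x - x_l)) * math.sin(k * (x0 - x_l)) / (s + k * k)
    return 2.0 / length * total
```

The slow $1/n^2$ decay of the terms in the s-domain contrasts with the exponential decay of the terms $e^{\epsilon_n\tau}$ after Laplace inversion, which is why the series for the kernel itself converges rapidly.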

Equation (3.202) is a generic series solution for the kernel when $\lambda(x)$, $\nu(x)$, the solution
interval being considered, and the imposed boundary conditions all combined are such that
one indeed has a regular Sturm-Liouville problem at hand, i.e., if it is true that the Green’s
function ($G$ or $\hat{G}$) has the assumed discrete eigenfunction-eigenvalue expansion. In many
applications, however, the Sturm-Liouville problem of interest may not be of regular type
but, rather, of so-called singular Sturm-Liouville type. This situation occurs in a variety of
cases, such as when $p(x)$ in equation (3.178) attains a zero value at either solution endpoint or
the functions $p$, $q$ become unbounded or the solution interval is unbounded (e.g., $x \in [0, \infty)$,
$(-\infty, \infty)$, etc.). The eigenvalues may not be discrete in such cases, and the problem may have a
continuous or a mixed eigenvalue spectrum, in which cases the generic formulas are generally
not valid. Even in singular Sturm-Liouville problems for which the spectrum is discrete,
the convergence of the eigenfunction expansions must also be examined on an individual
basis. However, a substantial class of important singular Sturm-Liouville boundary value
problems involving the so-called hypergeometric and confluent hypergeometric equations
(such as Bessel’s equation, for which an in-depth Green’s function development is given in
the next section) can still be treated within the earlier eigenfunction formulation. This class of
problems will generally admit a spectral resolution (or decomposition) of the Green’s function
G as well as the kernel u as a sum of a discrete and a continuous eigenvalue-eigenfunction
portion. In closing this section, we emphasize that the complex contour integral framework of
the previous section has a general applicability. In particular, it is applicable to most singular
Sturm-Liouville problems of interest and can be shown to recover the spectral decomposition
formulas. In fact the approach of the previous section is used in the next section to arrive
at analytically closed-form kernels for the Bessel process involving Bessel’s equation. The
procedure for extracting the kernels analytically is then basically an advanced exercise in the
application of the residue theorem of complex analysis.


3.7 Kernels for the Bessel Process


In this section we apply the Green’s function methodology of Section 3.6 to the so-called
Bessel process and obtain exact analytical solutions for the kernel $u(x, x_0; \tau)$ for all cases
of interest: (1) no absorption (barrier free), (2) absorption at two finite endpoints (double
barrier), (3) absorption at a single upper endpoint (single upper barrier), and (4) absorption
at a single lower endpoint (single lower barrier).

The Bessel process is characterized by a square root volatility$^5$ function $\nu(x) = 2\sqrt{x}$ and
drift $\lambda(x) = \lambda = \text{const.}$:

$$dx_t = \lambda\, dt + 2\sqrt{x_t}\, dW_t. \tag{3.203}$$

Moreover, throughout we consider $\lambda > 2$, where all path values are strictly positive, $x_t > 0$.
The allowable domain for the kernel is hence $x > 0$. The corresponding Sturm-Liouville
operator in equation (3.178) has $p(x) = 2x$ and $q(x) = \mu^2/2x$, where $\mu \equiv \frac{\lambda}{2} - 1 > 0$, and
equation (3.183) takes the form

$$\hat{y}'' + \frac{1}{x}\hat{y}' - \left(\frac{s}{2x} + \frac{\mu^2}{4x^2}\right)\hat{y} = 0. \tag{3.204}$$

By a change of variable this equation leads to the modified Bessel’s equation [see equation
(3.374) in Appendix C to this chapter], as one can readily verify. Two linearly independent
solutions to equation (3.204) are $\hat{y}_1(x) = I_\mu(\sqrt{2sx})$ and $\hat{y}_2(x) = K_\mu(\sqrt{2sx})$. Here $I_\mu$ and $K_\mu$
are the modified (i.e., hyperbolic) Bessel functions of the first and second kinds, respectively,
of order $\mu > 0$. The functions $K_\mu$ are also commonly called the Macdonald functions (see, for
example, [AS64]). For convenient reference, some common useful properties of the Bessel
and modified Bessel functions are given in this chapter’s Appendix C. These functions are
linearly independent for all values of $\mu$; hence linear combinations of these two solutions can
be used to satisfy the appropriate boundary conditions for the Green’s function $\hat{G}$ (and $G$)
and hence for the kernel $u(x, x_0; \tau)$.


3.7.1 The Barrier-Free Kernel: No Absorption


Let us consider the case of homogeneous boundary conditions at the endpoints of the entire
positive region $(0, \infty)$. The exact kernel is now readily obtained in analytically closed form.
To begin with, the density must satisfy zero-boundary conditions

$$\lim_{x \to 0} u(x, x_0; \tau) = \lim_{x \to \infty} u(x, x_0; \tau) = 0. \tag{3.205}$$

$^5$The Bessel process obeying $dx_t = \lambda\, dt + \nu_0\sqrt{x_t}\, dW_t$ with arbitrary nonzero constant parameter $\nu_0$ is obtained
from equation (3.203) by making a scale change in the order and in the time: $\lambda \to \lambda/\alpha$, $t \to \alpha t$, where $\alpha = \nu_0^2/4$.
In particular, by simply changing $\lambda \to 4\lambda/\nu_0^2$ and $\tau \to \nu_0^2\tau/4$, all the formulas for the Bessel process with parameter
$\nu_0$ follow from those explicitly given for the process obeying equation (3.203), where $\nu_0 = 2$.

Hence, the Green’s function corresponding to equation (3.185), with $x_L \to 0$ and $x_H \to \infty$,
obtains by the choice $\hat{y}_1(x; s) = I_\mu(\sqrt{2sx})$ and $\hat{y}_2(x; s) = K_\mu(\sqrt{2sx})$, since (for positive
order $\mu$) $I_\mu(z) \to 0$ as $|z| \to 0$ and $K_\mu(z) \to 0$ as $|z| \to \infty$ for generally complex $z$. In
particular, $I_\mu(\sqrt{2sx}) \to 0$ as $x \to 0$ and $K_\mu(\sqrt{2sx}) \to 0$ as $x \to \infty$, for any value of s. The
Wronskian of these two functions is $W(x) = -1/2x$, so $pW = -1$. Combining this into
equation (3.184) gives


$$\hat{G}(x, x_0; s) = \begin{cases} I_\mu(\sqrt{2sx})\, K_\mu(\sqrt{2sx_0}), & x \le x_0, \\[1ex] K_\mu(\sqrt{2sx})\, I_\mu(\sqrt{2sx_0}), & x_0 \le x. \end{cases} \tag{3.206}$$

Note that this function has been constructed to match the zero-boundary conditions at $x = 0$
and $x = \infty$. For the Bessel process the multiplicative factor in equation (3.185) is simply
$(x/x_0)^{\mu/2}$; hence equation (3.185) reduces to

$$G(x, x_0; s) = \left(\frac{x}{x_0}\right)^{\mu/2} \hat{G}(x, x_0; s), \tag{3.207}$$

giving

$$G(x, x_0; s) = \left(\frac{x}{x_0}\right)^{\mu/2} \begin{cases} I_\mu(\sqrt{2sx})\, K_\mu(\sqrt{2sx_0}), & 0 < x \le x_0, \\[1ex] K_\mu(\sqrt{2sx})\, I_\mu(\sqrt{2sx_0}), & x_0 \le x < \infty. \end{cases} \tag{3.208}$$

Observe from equation (3.206) that the symmetry property $\hat{G}(x, x_0; s) = \hat{G}(x_0, x; s)$ is evident
by interchanging $x$ with $x_0$. This is consistent with the fact that the Sturm-Liouville operator
$\hat{\mathcal{L}}$ is self-adjoint. Note that this symmetry property is not true for the original Green’s function
$G$ in equation (3.208), as expected since the Fokker-Planck operator $\mathcal{L}$ in equation (3.169)
is not self-adjoint in this case.
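
These statements can be checked numerically in a special case where the Bessel functions are elementary. The sketch below assumes the drift value $\lambda = 3$, so that $\mu = 1/2$ and $I_{1/2}(z) = \sqrt{2/\pi z}\,\sinh z$, $K_{1/2}(z) = \sqrt{\pi/2z}\,e^{-z}$. It verifies by finite differences that both functions of $\sqrt{2sx}$ satisfy $\hat{y}'' + \hat{y}'/x - (s/2x + \mu^2/4x^2)\hat{y} = 0$, that $pW = -1$ with $p = 2x$, and that $\hat{G}$ is symmetric while $G = (x/x_0)^{\mu/2}\hat{G}$ is not. The names, step sizes, and sample points are hypothetical.

```python
import math

MU = 0.5  # order for the hypothetical drift value lambda = 3 (mu = lambda/2 - 1)


def bessel_i_half(z):
    """Modified Bessel function I_{1/2}(z) for real z > 0 (elementary closed form)."""
    return math.sqrt(2.0 / (math.pi * z)) * math.sinh(z)


def bessel_k_half(z):
    """Modified Bessel function K_{1/2}(z) for real z > 0 (elementary closed form)."""
    return math.sqrt(math.pi / (2.0 * z)) * math.exp(-z)


def y1(x, s):
    return bessel_i_half(math.sqrt(2.0 * s * x))


def y2(x, s):
    return bessel_k_half(math.sqrt(2.0 * s * x))


def ode_residual(y, x, s, h=1e-3):
    """Finite-difference residual of y'' + y'/x - (s/(2x) + mu^2/(4x^2)) y."""
    d1 = (y(x + h, s) - y(x - h, s)) / (2.0 * h)
    d2 = (y(x + h, s) - 2.0 * y(x, s) + y(x - h, s)) / h**2
    return d2 + d1 / x - (s / (2.0 * x) + MU**2 / (4.0 * x * x)) * y(x, s)


def p_times_wronskian(x, s, h=1e-5):
    """p(x) * (y1 y2' - y1' y2) with p(x) = 2x; expected to equal -1."""
    d1 = (y1(x + h, s) - y1(x - h, s)) / (2.0 * h)
    d2 = (y2(x + h, s) - y2(x - h, s)) / (2.0 * h)
    return 2.0 * x * (y1(x, s) * d2 - d1 * y2(x, s))


def green_hat(x, x0, s):
    lo, hi = min(x, x0), max(x, x0)
    return y1(lo, s) * y2(hi, s)


def green(x, x0, s):
    return (x / x0) ** (MU / 2.0) * green_hat(x, x0, s)
```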

From the theory of Section 3.6 we know that the inverse Laplace transform (with respect 
to s) of this function will yield the density according to equation (3.174). We now proceed to 
compute the Bromwich integral analytically using standard techniques of complex analysis. 

In proceeding further, we use the known fact that $I_\mu K_\mu$ (for all $x \le x_0$ or $x_0 \le x$) within
$\hat{G}(x, x_0; s)$ is analytic on the complex s-plane, with the exception of a (square root) branch
point at $s = 0$. For this reason we need to introduce a branch cut along some branch or ray
emanating from the origin of the complex s-plane. It is convenient to choose the principal
branch cut defined by $\arg s = \pi$ along the negative real axis and to consider points on the
complex s-plane with $|\arg s| < \pi$. We therefore extend the Bromwich contour to that of a
closed contour that bypasses the branch cut, as in Figure 3.10. Note that this same contour
was used earlier for the Wiener process.

The Bromwich integral in equation (3.174) corresponds to the line segment MN. Since
$\hat{G}(x, x_0; s)$ is analytic everywhere (i.e., no singularities) on and inside the entire region within
the closed contour for all values of the radius $R > 0$ as well as for positive parameters
$\rho$, $\delta$, and $\gamma$ taken arbitrarily close to zero, Cauchy’s integral theorem gives zero for the loop
integral. Hence the Bromwich integral is equal to the negative of the sum of all the other
contour integrals that make up the closed loop. From this fact, the kernel is then given as the
negative sum of such integrals:

$$u(x, x_0; \tau) = \left(\frac{x}{x_0}\right)^{\mu/2} \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} e^{s\tau}\, \hat{G}(x, x_0; s)\, ds = -\left(\frac{x}{x_0}\right)^{\mu/2} \frac{1}{2\pi i} \left[\int_{C_R^{+}} + \int_{C_R^{-}} + \int_{C_\rho} + \int_{P}^{Q} + \int_{Q'}^{P'}\right] e^{s\tau}\, \hat{G}(x, x_0; s)\, ds. \tag{3.209}$$

FIGURE 3.10 The closed contour integral for the Laplace inversion of the Green’s function for the
barrier-free case with a branch cut.


Although this integrand involves nonelementary special functions, the steps that follow
make use of standard techniques to reduce this sum of seemingly complicated integrals
to an analytically tractable form. Taking limits $R \to \infty$ and $\gamma, \rho, \delta \to 0$, it readily follows
that the $C_R^{\pm}$ integrals vanish, since along the semicircular contours $s = Re^{i\theta}$, $\frac{\pi}{2} < |\theta| < \pi$,
hence $|e^{s\tau}| = e^{R\tau\cos\theta} \to 0$ as $R \to \infty$ with $\cos\theta < 0$. The integrand therefore approaches zero
as $R \to \infty$, since $I_\mu(z)K_\mu(z) \sim 1/2z$ as $|z| \to \infty$ from the leading-order asymptotic expansions
of the modified Bessel functions. The $C_\rho$ integral for the segment $QQ'$ with $s = \rho e^{i\theta}$,
$-\pi < \theta < \pi$, also vanishes as $\rho \to 0$. In particular for $x < x_0$,

$$\left|\int_{C_\rho} ds\, e^{s\tau}\, \hat{G}(x, x_0; s)\right| \le \rho\, e^{\rho\tau} \int_{-\pi}^{\pi} \left|I_\mu\!\left(\sqrt{2\rho x}\, e^{i\theta/2}\right) K_\mu\!\left(\sqrt{2\rho x_0}\, e^{i\theta/2}\right)\right| d\theta. \tag{3.210}$$

Since $\mu > 0$, $|I_\mu(\sqrt{2\rho x}\, e^{i\theta/2})\, K_\mu(\sqrt{2\rho x_0}\, e^{i\theta/2})| \to \text{const.}$ (independent of $\rho$) in the limit $\rho \to 0$.
The same result applies when $x > x_0$; hence the $C_\rho$ integral vanishes in the limit $\rho \to 0$. The
only nonzero integrals are along the branch cut corresponding to the $PQ$ and $Q'P'$ segments,
where $s = re^{i\pi}$ ($\sqrt{s} = i\sqrt{r}$) and $s = re^{-i\pi}$ ($\sqrt{s} = -i\sqrt{r}$), respectively, in the limit $\delta \to 0$,
with $\rho < r < R$. In the limits $\rho \to 0$ and $R \to \infty$ the two integrals are combined to give the
real-valued integral

$$u(x, x_0; \tau) = \left(\frac{x}{x_0}\right)^{\mu/2} \frac{1}{2\pi i} \int_0^{\infty} e^{-r\tau} \left[\hat{G}(x, x_0; e^{-i\pi}r) - \hat{G}(x, x_0; e^{i\pi}r)\right] dr, \tag{3.211}$$

where

$$\hat{G}(x, x_0; e^{-i\pi}r) = \begin{cases} I_\mu(-i\sqrt{2xr})\, K_\mu(-i\sqrt{2x_0 r}), & x \le x_0, \\[1ex] K_\mu(-i\sqrt{2xr})\, I_\mu(-i\sqrt{2x_0 r}), & x_0 \le x, \end{cases} \tag{3.212}$$

and $\hat{G}(x, x_0; e^{i\pi}r)$ is given by the complex conjugate expression. We note that the integral
involves the value of the branch cut discontinuity (or jump discontinuity) of the Green’s
function along the entire cut. This is typical of Green’s functions for barrier-free kernels, as
we have seen in the simpler case of the Wiener process.

The integrand in equation (3.211) is readily simplified by computing the jump discontinuity
by use of the identity

$$\frac{1}{\pi i} \left[I_\mu(-ia)\, K_\mu(-ib) - I_\mu(ia)\, K_\mu(ib)\right] = J_\mu(a)\, J_\mu(b) \tag{3.213}$$

for any real $a$, $b$ and where $J_\mu$ are the ordinary Bessel functions of the first kind. Note that
since this expression is symmetric with respect to interchanging $a$ and $b$, it follows that the
integral simplifies to

$$u(x, x_0; \tau) = \frac{1}{2} \left(\frac{x}{x_0}\right)^{\mu/2} \int_0^{\infty} e^{-r\tau}\, J_\mu(\sqrt{2xr})\, J_\mu(\sqrt{2x_0 r})\, dr \tag{3.214}$$

for any $x, x_0 > 0$, irrespective of the relative magnitude of $x$ and $x_0$. This result is now
simplified further by applying the integral identity (3.359) in Appendix C of this chapter with
choice $\alpha = \tau$, $\beta = \sqrt{x/2}$, $\gamma = \sqrt{x_0/2}$, finally giving the known exact closed-form expression
for the barrier-free kernel:

$$u(x, x_0; \tau) = \left(\frac{x}{x_0}\right)^{\mu/2} \frac{e^{-(x + x_0)/2\tau}}{2\tau}\, I_\mu\!\left(\frac{\sqrt{x x_0}}{\tau}\right). \tag{3.215}$$
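
The closed-form kernel can be tested numerically in the elementary case $\mu = 1/2$ (an assumed drift value $\lambda = 3$), where $J_{1/2}(z) = \sqrt{2/\pi z}\,\sin z$ and $I_{1/2}(z) = \sqrt{2/\pi z}\,\sinh z$. The sketch below compares $\frac12 (x/x_0)^{\mu/2}\int_0^\infty e^{-r\tau}J_\mu(\sqrt{2xr})J_\mu(\sqrt{2x_0 r})\,dr$, evaluated by quadrature after the substitution $r = y^2$, with $(x/x_0)^{\mu/2}\,e^{-(x+x_0)/2\tau}\,I_\mu(\sqrt{xx_0}/\tau)/2\tau$, and also checks that the closed form integrates to one over $x \in (0, \infty)$. The quadrature rule, truncation points, and sample values are hypothetical.

```python
import math

MU = 0.5  # order for the hypothetical drift value lambda = 3


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def bessel_i_half(z):
    return math.sqrt(2.0 / (math.pi * z)) * math.sinh(z)


def kernel_closed(x, x0, tau):
    """Closed-form barrier-free kernel for mu = 1/2."""
    ratio = (x / x0) ** (MU / 2.0)
    gauss = math.exp(-(x + x0) / (2.0 * tau)) / (2.0 * tau)
    return ratio * gauss * bessel_i_half(math.sqrt(x * x0) / tau)


def kernel_integral(x, x0, tau):
    """Integral over the branch cut for mu = 1/2, using r = y**2.

    J_{1/2}(a y) J_{1/2}(b y) 2 y = (4 / (pi sqrt(a b))) sin(a y) sin(b y),
    with a = sqrt(2 x) and b = sqrt(2 x0).
    """
    a, b = math.sqrt(2.0 * x), math.sqrt(2.0 * x0)

    def integrand(y):
        return math.exp(-tau * y * y) * math.sin(a * y) * math.sin(b * y)

    prefactor = 0.5 * (x / x0) ** (MU / 2.0) * 4.0 / (math.pi * math.sqrt(a * b))
    return prefactor * simpson(integrand, 0.0, 12.0 / math.sqrt(tau))


def total_probability(x0, tau):
    """Integrate the closed-form kernel over x in (0, inf), using x = w**2."""
    def integrand(w):
        return 0.0 if w == 0.0 else 2.0 * w * kernel_closed(w * w, x0, tau)
    return simpson(integrand, 0.0, math.sqrt(x0) + 12.0 * math.sqrt(tau))
```

A total probability of one is consistent with the absence of absorption at the origin for $\lambda > 2$.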





3.7.2 The Case of Two Finite Barriers with Absorption


Here we consider homogeneous zero-boundary conditions at arbitrary finite endpoints $x_L$
and $x_H$ with $0 < x_L < x_H < \infty$ and thereby obtain the kernel, denoted by $u(x, x_0, x_L, x_H; \tau)$,
for two absorbing boundary conditions (i.e., a double barrier) at finite values $x = x_L$ and
$x = x_H$. In our notation we explicitly denote the dependence of u on the endpoint values. The
boundary conditions imposed on the kernel are

$$u(x = x_L, x_0, x_L, x_H; \tau) = u(x = x_H, x_0, x_L, x_H; \tau) = 0. \tag{3.216}$$

Hence, the Green’s function corresponding to equation (3.185) obtains by the choice $y_1(x) =
y_1(x; s) = \phi_\mu(x_L, x; s)$ and $y_2(x) = y_2(x; s) = \phi_\mu(x_H, x; s)$, where we have defined the function

$$\phi_\mu(a, b; z) = I_\mu(\sqrt{2az})K_\mu(\sqrt{2bz}) - K_\mu(\sqrt{2az})I_\mu(\sqrt{2bz}) \tag{3.217}$$

for generally complex $z$ and real parameters $a, b$. The two independent solutions are simply
linear combinations of the $I_\mu$ and $K_\mu$ functions satisfying the respective zero-boundary
conditions: $y_1(x = x_L) = 0$, $y_2(x = x_H) = 0$. In this case the Wronskian is shown to give
$pW[y_1, y_2] = \phi_\mu(x_H, x_L; s) = -\phi_\mu(x_L, x_H; s)$, and hence the Green’s function is given, via
equation (3.185), as


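$$G(x, x_0; s) = \left(\frac{x}{x_0}\right)^{\mu/2}
\begin{cases}
\dfrac{\phi_\mu(x_L, x; s)\,\phi_\mu(x_H, x_0; s)}{\phi_\mu(x_L, x_H; s)}, & x_L \le x \le x_0 \\[2ex]
\dfrac{\phi_\mu(x_H, x; s)\,\phi_\mu(x_L, x_0; s)}{\phi_\mu(x_L, x_H; s)}, & x_0 \le x \le x_H.
\end{cases} \tag{3.218}$$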


In order to obtain the transition density we will invert this Green’s function again with
the use of a closed contour integral while taking into account all singularities of $G$ on the
complex $s$-plane. First note that $s = 0$ may be a possible branch point due to the $\sqrt{s}$ argument.
Since the functions $\phi_\mu$ are analytic on the entire $s$-plane (excluding possibly the branch cut),
all other singularities of $G$ are the zeros of the denominator $\phi_\mu(x_L, x_H; s)$ in equation (3.218).
From equation (3.217) and using properties of the Bessel functions, we see that the zeros
must lie along the negative real axis. Indeed, putting $s = -\epsilon$ for any real $\epsilon > 0$ gives

$$\phi_\mu(x_L, x_H; -\epsilon) = K_\mu(i\bar{x}_H)\,I_\mu(i\bar{x}_L) - K_\mu(i\bar{x}_L)\,I_\mu(i\bar{x}_H). \tag{3.219}$$


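To compact notation, we have denoted the real quantities $\bar{x}_L = \sqrt{2\epsilon x_L}$, $\bar{x}_H = \sqrt{2\epsilon x_H}$. Using
the properties $I_\mu(ix) = i^{\mu}J_\mu(x)$ and $K_\mu(iy) = \frac{\pi}{2}[I_{-\mu}(iy) - I_\mu(iy)]/\sin \pi\mu$ for real $x, y$ gives
$I_\mu(ix)K_\mu(iy) = \frac{\pi}{2}\csc(\pi\mu)\,J_\mu(x)[J_{-\mu}(y) - e^{i\pi\mu}J_\mu(y)]$ for any noninteger $\mu$. Using this we
obtain the identity

$$I_\mu(ix)K_\mu(iy) - K_\mu(ix)I_\mu(iy) = \frac{\pi}{2}\left[J_\mu(y)Y_\mu(x) - J_\mu(x)Y_\mu(y)\right], \tag{3.220}$$
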
which applies for all $\mu$ (integer values included), where the usual limiting procedure (i.e.,
analytic continuation in $\mu$) is used in the definition of the Bessel $K_\mu$ and $Y_\mu$ functions for the
case of integer order $\mu$. The functions $Y_\mu$ are the ordinary Bessel functions of the second kind
of order $\mu$ (see, for example, [AS64]). In contrast to the monotonic and positive hyperbolic
Bessel functions for real arguments, the ordinary Bessel functions are oscillatory. In particular,
the functions on the right-hand side of equation (3.220) involving the difference of products
of ordinary Bessel functions (these are sometimes referred to as cylinder functions) have a
countably infinite number of zeros. The zeros of the denominator of the Green’s function are
hence all real and negative. To simplify notation, we shall denote these zeros by $\epsilon_n = \epsilon_n^{(\mu)}$,
where it is implicitly understood that these are really the $n$th eigenvalues for given $\mu$. The
equation determining these zeros (i.e., the eigenvalues of the Sturm–Liouville operator with
zero-boundary conditions at two finite endpoints) is therefore $\phi_\mu(x_L, x_H; s = \epsilon_n) = 0$; i.e.,
from equations (3.219) and (3.220),

$$J_\mu\!\left(\sqrt{2|\epsilon_n|x_H}\right) Y_\mu\!\left(\sqrt{2|\epsilon_n|x_L}\right) - J_\mu\!\left(\sqrt{2|\epsilon_n|x_L}\right) Y_\mu\!\left(\sqrt{2|\epsilon_n|x_H}\right) = 0. \tag{3.221}$$


Solving for $|\epsilon_n|$ gives the eigenvalues $\epsilon_n = -|\epsilon_n|$ for all integers $n \ge 1$. The eigenvalues form
a sequence of negative values along the entire negative real axis. Note that this is entirely
consistent with a regular Sturm–Liouville boundary value problem. These zeros occur in
increasing order $|\epsilon_1| < |\epsilon_2| < \cdots$, and are readily obtained by standard numerical procedures.
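One such "standard numerical procedure" can be sketched as a sign-change scan followed by bisection on the left-hand side of equation (3.221). The code below is only an illustration under stated assumptions: the function names, the truncated power series for $J_\mu$, the restriction to noninteger $\mu$ (so that $Y_\mu$ can be built from $J_{\pm\mu}$), and the scan step are all choices made here, not prescriptions from the text. For $\mu = 1/2$ the Bessel functions are elementary and equation (3.221) reduces to $\sin\!\big(\sqrt{2|\epsilon_n|}(\sqrt{x_H} - \sqrt{x_L})\big) = 0$, which gives a closed-form value to compare against.

```python
import math

def bessel_j(mu, z, terms=60):
    """J_mu(z) for z > 0 from its ascending series (adequate for moderate z only)."""
    return sum((-1) ** k * (0.5 * z) ** (2 * k + mu)
               / (math.factorial(k) * math.gamma(k + mu + 1.0))
               for k in range(terms))

def bessel_y(mu, z):
    """Y_mu(z) for NON-INTEGER mu, built from J_mu and J_{-mu}."""
    return (bessel_j(mu, z) * math.cos(mu * math.pi) - bessel_j(-mu, z)) / math.sin(mu * math.pi)

def cross_product(eps, x_low, x_high, mu):
    """Left-hand side of equation (3.221) as a function of eps = |eps_n|."""
    a = math.sqrt(2.0 * eps * x_low)
    b = math.sqrt(2.0 * eps * x_high)
    return bessel_j(mu, b) * bessel_y(mu, a) - bessel_j(mu, a) * bessel_y(mu, b)

def eigenvalues(x_low, x_high, mu, count, step=0.05, tol=1e-12):
    """First `count` values |eps_n|: scan for sign changes, then bisect."""
    roots, lo = [], step
    f_lo = cross_product(lo, x_low, x_high, mu)
    while len(roots) < count:
        hi = lo + step
        f_hi = cross_product(hi, x_low, x_high, mu)
        if f_lo * f_hi < 0.0:
            a, b, fa = lo, hi, f_lo
            while b - a > tol:
                mid = 0.5 * (a + b)
                f_mid = cross_product(mid, x_low, x_high, mu)
                if fa * f_mid <= 0.0:
                    b = mid
                else:
                    a, fa = mid, f_mid
            roots.append(0.5 * (a + b))
        lo, f_lo = hi, f_hi
    return roots

if __name__ == "__main__":
    # Hypothetical barriers x_L = 1, x_H = 4 and order mu = 1/2.
    for n, eps in enumerate(eigenvalues(1.0, 4.0, 0.5, 3), start=1):
        print(n, eps, 0.5 * (n * math.pi) ** 2)  # numeric root vs closed form for mu = 1/2
```

A fixed scan step can miss closely spaced roots; a production implementation would more likely rely on a library Bessel routine and an adaptive bracketing strategy.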

We are now in a position to compute the Bromwich integral analytically using a similar
contour integration procedure as before. However, in contrast to the barrier-free Green’s
function of the previous section, $G$ has isolated singularities at the zeros of the denominator
at $s = -|\epsilon_n|$ along the negative real axis ($\arg s = \pi$). At all other points not lying on the
branch cut and for $s \ne \epsilon_n$, $G$ is analytic, since it is a ratio of two analytic functions with
nonzero denominator. Although we have freedom in the choice of branch cut, the choice of
cut along the negative imaginary axis with $\arg s = \frac{3\pi}{2}$ is convenient. We therefore consider
$-\frac{\pi}{2} < \arg s < \frac{3\pi}{2}$ and close the Bromwich contour and apply the residue theorem to the
loop integral in Figure 3.11.

FIGURE 3.11 The contour integral for the Laplace inversion of the Green’s function for the case of
absorption at two finite endpoints with branch cut along the negative imaginary axis.

Applying the Cauchy residue formula to the closed contour in
Figure 3.11 gives the Laplace inverse of $G$, and hence the kernel, as


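$$\begin{aligned}
u(x, x_0, x_L, x_H; \tau) &= \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} e^{s\tau}\,G(x, x_0; s)\,ds \\
&= \sum_{n=1}^{\infty} e^{-|\epsilon_n|\tau}\,\mathrm{Res}\,G(x, x_0; s = -|\epsilon_n|)
 - \frac{1}{2\pi i}\left[\int_{C_R} + \int_{C_\rho} + \int_{I^+} + \int_{I^-}\right] e^{s\tau}\,G(x, x_0; s)\,ds
\end{aligned} \tag{3.222}$$

where the first term involves a sum over the residues of $G$, as a function of $s$, at all eigenvalues
$s = \epsilon_n$, $n = 1, \ldots, \infty$. In this formula the limits $R \to \infty$ and $\gamma, \rho, \delta \to 0$ are implied. Taking
such limits it readily follows that the semicircular $C_R$ integral, with $s = Re^{i\theta}$, $\frac{\pi}{2} < \theta < \frac{3\pi}{2}$,
approaches zero. This obtains from the property of the $\phi_\mu$ functions, which in the limit
$R \to \infty$ gives $|e^{s\tau}G(x, x_0; s)| \to \kappa\, e^{-R\tau|\cos\theta|}\, e^{-\beta\sqrt{R}}/\sqrt{R} \to 0$, where $\kappa, \beta$ are positive constants
dependent on $x, x_0, x_L, x_H$. Then using the property $\lim_{\rho \to 0} G(x, x_0; s = \rho e^{i\theta}) \to$ const.,
independent of $\rho$, a similar argument as used in the previous section allows us to conclude
that the $C_\rho$ integral approaches zero. The sum of the $I^+$ and $I^-$ integrals in the limits $\rho \to 0$,
$R \to \infty$ gives

$$-\frac{1}{2\pi}\int_0^\infty e^{-ir\tau}\left[G(x, x_0; re^{i\frac{3\pi}{2}}) - G(x, x_0; re^{-i\frac{\pi}{2}})\right]dr. \tag{3.223}$$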
By completing a circuit around the origin, however, one easily proves the property

$$\phi_\mu(a, b; e^{2\pi i}z) = \phi_\mu(a, b; z) \tag{3.224}$$

for any complex $z \ne 0$ and any positive real $a, b$. This shows that there is no jump discontinuity
in the function $\phi_\mu$ along any choice of branch cut. Since $G$ is a function of a product and a ratio
of such functions, $G$ also has no jumps. Indeed, applying the last identity with the particular
choice $z = re^{-i\frac{\pi}{2}}$, equation (3.218) gives $G(x, x_0; re^{i\frac{3\pi}{2}}) = G(x, x_0; re^{-i\frac{\pi}{2}})$. Equation (3.222)
hence reduces to only the sum of residues:

$$u(x, x_0, x_L, x_H; \tau) = \sum_{n=1}^{\infty} e^{-|\epsilon_n|\tau}\,\mathrm{Res}\,G(x, x_0; s = -|\epsilon_n|). \tag{3.225}$$
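The residues of the Green’s function are evaluated analytically as follows. From the
analyticity of the $\phi_\mu$ functions we observe that every point $s = \epsilon_n$ is a simple pole of
$G(x, x_0; s)$; i.e., $\phi_\mu(x_L, x_H; s)$ has simple zeros at every $s = \epsilon_n$, as will be shown. Hence we have

$$\mathrm{Res}\,G(x, x_0; \epsilon_n) = \left(\frac{x}{x_0}\right)^{\mu/2}
\begin{cases}
\dfrac{\phi_\mu(x_L, x; \epsilon_n)\,\phi_\mu(x_H, x_0; \epsilon_n)}{\left.\dfrac{\partial \phi_\mu(x_L, x_H; s)}{\partial s}\right|_{s=\epsilon_n}}, & x \le x_0 \\[4ex]
\dfrac{\phi_\mu(x_H, x; \epsilon_n)\,\phi_\mu(x_L, x_0; \epsilon_n)}{\left.\dfrac{\partial \phi_\mu(x_L, x_H; s)}{\partial s}\right|_{s=\epsilon_n}}, & x_0 \le x.
\end{cases} \tag{3.226}$$

To compute this residue we use $\left.\frac{\partial \phi_\mu(x_L, x_H; s)}{\partial s}\right|_{s=\epsilon_n} = -\left.\frac{\partial \phi_\mu(x_L, x_H; -\epsilon)}{\partial \epsilon}\right|_{\epsilon=|\epsilon_n|}$, since $\epsilon_n = -|\epsilon_n|$, and
we hence consider

$$\phi_\mu(x_L, x_H; -\epsilon) = \frac{\pi}{2}\left[J_\mu(\bar{x}_H)Y_\mu(\bar{x}_L) - J_\mu(\bar{x}_L)Y_\mu(\bar{x}_H)\right], \tag{3.227}$$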


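which follows from equations (3.219) and (3.220) for real $\epsilon > 0$. Differentiating this equation
at $\epsilon = |\epsilon_n|$ while making use of equation (3.221) and the recurrence relations
$J'_\mu(z) = (\mu/z)J_\mu(z) - J_{\mu+1}(z)$, $Y'_\mu(z) = (\mu/z)Y_\mu(z) - Y_{\mu+1}(z)$ gives

$$\begin{aligned}
\left.\frac{\partial \phi_\mu(x_L, x_H; -\epsilon)}{\partial \epsilon}\right|_{\epsilon=|\epsilon_n|}
&= \frac{\pi}{4|\epsilon_n|}\Big\{\bar{x}_H\big[J_\mu(\bar{x}_L)Y_{\mu+1}(\bar{x}_H) - J_{\mu+1}(\bar{x}_H)Y_\mu(\bar{x}_L)\big]
 - \bar{x}_L\big[J_\mu(\bar{x}_H)Y_{\mu+1}(\bar{x}_L) - J_{\mu+1}(\bar{x}_L)Y_\mu(\bar{x}_H)\big]\Big\} \\
&= \frac{1}{2|\epsilon_n|}\left[\frac{Y_\mu(\bar{x}_H)}{Y_\mu(\bar{x}_L)} - \frac{Y_\mu(\bar{x}_L)}{Y_\mu(\bar{x}_H)}\right],
\end{aligned} \tag{3.228}$$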


where $\bar{x}_L = \sqrt{2|\epsilon_n|x_L}$, $\bar{x}_H = \sqrt{2|\epsilon_n|x_H}$. The last expression obtains from the identity
$J_\mu(z)Y_{\mu+1}(z) - J_{\mu+1}(z)Y_\mu(z) = -2/\pi z$. Note that the expression in equation (3.228) is readily
seen to be nonzero, since $x_H > x_L$ and the zeros $|\epsilon_n| = |\epsilon_n^{(\mu)}|$ of equation (3.221) cannot also be
zeros of the $Y_\mu$ functions for given order $\mu$. All poles $s = -|\epsilon_n|$ are therefore simple, justifying
our assumption. Substituting the expression in equation (3.228) into equation (3.226) and
again using equation (3.221) with some tedious algebraic manipulation gives the closed-form
compact formula for the residue:








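$$\mathrm{Res}\,G(x, x_0; s = -|\epsilon_n|) = \left(\frac{x}{x_0}\right)^{\mu/2}\phi_n(x)\,\phi_n(x_0), \tag{3.229}$$

where $\phi_n(x)$ are (eigenfunctions) given by

$$\phi_n(x) = N_n\left[Y_\mu(\bar{x}_L)J_\mu(\bar{x}) - J_\mu(\bar{x}_L)Y_\mu(\bar{x})\right], \tag{3.230}$$

with normalization factor

$$N_n = \pi\sqrt{\frac{|\epsilon_n|}{2}}\left[\frac{Y_\mu^2(\bar{x}_L)}{Y_\mu^2(\bar{x}_H)} - 1\right]^{-1/2}. \tag{3.231}$$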


Here we have used the shorthand notation $\bar{z} = \sqrt{2|\epsilon_n|z}$. Note that the result is valid for all
$x, x_0 > 0$ values, for it does not actually depend on the relative magnitude of $x, x_0$. As is the
case in all eigenfunction expansion solutions for a Sturm–Liouville problem (for $G$ in this
case), the occurrence of $\phi_n(x)\phi_n(x_0)$ is symmetric with respect to interchanging $x$ and $x_0$.
Finally, inserting this residue formula into equation (3.225) gives the kernel for the domain
$x_L \le x \le x_H$ subject to double-ended zero-boundary conditions at $x = x_L, x_H$ as an exact
closed-form eigenfunction series:

$$u(x, x_0, x_L, x_H; \tau) = \left(\frac{x}{x_0}\right)^{\mu/2}\sum_{n=1}^{\infty} e^{-|\epsilon_n|\tau}\,\phi_n(x)\,\phi_n(x_0). \tag{3.232}$$

From the distribution of increasing values of $|\epsilon_n|$ with $n$, as can be shown from equation
(3.221), this series converges fairly rapidly for finite values of time $\tau$, particularly when
$\tau$ is large relative to $1/|\epsilon_1|$. It is interesting to remark that the complex
analysis approach to the Green’s function methodology also automatically guarantees that the
eigenfunctions $\phi_n(x)$ are normalized, since in the limit $\tau \to 0$ the density $u$ must approach
the Dirac delta function $\delta(x - x_0)$. From another perspective, this result is also entirely
consistent with Sturm–Liouville theory as well as spectral theory for the eigenvalue problem
corresponding to the operator $L$ defined earlier. A direct, yet algebraically very tedious,
proof of the normalization $\int \phi_n(x)\phi_m(x)\,dx = \delta_{nm}$ also follows by use of appropriate
integral properties of products of the Bessel $J$ and $Y$ functions, as provided in this chapter’s
Appendix C.
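The eigenfunction series can be checked numerically in a special case. The sketch below assumes $\mu = 1/2$, where $J_{1/2}$, $Y_{1/2}$, and $I_{1/2}$ are elementary and the eigenvalue condition (3.221) has the closed-form solution $|\epsilon_n| = \frac{1}{2}\big(n\pi/(\sqrt{x_H} - \sqrt{x_L})\big)^2$. The eigenfunction is coded as $N_n[Y_\mu(\bar{x}_L)J_\mu(\bar{x}) - J_\mu(\bar{x}_L)Y_\mu(\bar{x})]$ with $N_n = \pi\sqrt{|\epsilon_n|/2}\,[Y_\mu^2(\bar{x}_L)/Y_\mu^2(\bar{x}_H) - 1]^{-1/2}$; this normalization, the function names, and the barrier values are assumptions made for the illustration. For short times and points far from both barriers, the double-barrier kernel should agree closely with the barrier-free kernel (3.215).

```python
import math

MU = 0.5  # illustrative order: Bessel functions of order 1/2 are elementary

def j_half(z):
    return math.sqrt(2.0 / (math.pi * z)) * math.sin(z)

def y_half(z):
    return -math.sqrt(2.0 / (math.pi * z)) * math.cos(z)

def eigenvalue(n, x_low, x_high):
    """|eps_n| for mu = 1/2 (closed-form root of the eigenvalue equation)."""
    return 0.5 * (n * math.pi / (math.sqrt(x_high) - math.sqrt(x_low))) ** 2

def eigenfunction(n, x, x_low, x_high):
    """Normalized eigenfunction phi_n(x), specialised to mu = 1/2."""
    eps = eigenvalue(n, x_low, x_high)
    zl, zh, z = (math.sqrt(2.0 * eps * v) for v in (x_low, x_high, x))
    norm = math.pi * math.sqrt(0.5 * eps) / math.sqrt((y_half(zl) / y_half(zh)) ** 2 - 1.0)
    return norm * (y_half(zl) * j_half(z) - j_half(zl) * y_half(z))

def double_barrier_kernel(x, x0, tau, x_low, x_high, terms=200):
    """Truncated eigenfunction series for the double-barrier kernel."""
    series = sum(math.exp(-eigenvalue(n, x_low, x_high) * tau)
                 * eigenfunction(n, x, x_low, x_high)
                 * eigenfunction(n, x0, x_low, x_high)
                 for n in range(1, terms + 1))
    return (x / x0) ** (0.5 * MU) * series

def free_kernel(x, x0, tau):
    """Barrier-free kernel (3.215) with I_{1/2}(z) = sqrt(2/(pi z)) sinh(z)."""
    z = math.sqrt(x * x0) / tau
    i_half = math.sqrt(2.0 / (math.pi * z)) * math.sinh(z)
    return (x / x0) ** (0.5 * MU) * math.exp(-(x + x0) / (2.0 * tau)) / (2.0 * tau) * i_half

if __name__ == "__main__":
    # Hypothetical barriers x_L = 0.49, x_H = 4 and a short time tau = 0.02.
    print(double_barrier_kernel(2.1, 2.0, 0.02, 0.49, 4.0), free_kernel(2.1, 2.0, 0.02))
```

The number of retained terms needed grows as $\tau$ decreases, since the factors $e^{-|\epsilon_n|\tau}$ then decay more slowly with $n$.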


The Case of a Single Upper Finite Barrier with Absorption 


This situation corresponds to zero-boundary conditions at $x = 0$ and at a finite upper endpoint
$x = x_H$, $0 < x_H < \infty$. We shall denote the kernel for this case by $u_H(x, x_0, x_H; \tau)$. The upper
endpoint turns out to be an absorbing-boundary condition (at a single upper barrier). The
boundary conditions imposed on the kernel are

$$u_H(x = 0, x_0, x_H; \tau) = u_H(x = x_H, x_0, x_H; \tau) = 0. \tag{3.233}$$

Hence, the Green’s function corresponding to equation (3.185) obtains with choice $y_1(x; s) =
I_\mu(\sqrt{2sx})$ and $y_2(x; s) = \phi_\mu(x_H, x; s)$. The Wronskian of these functions gives $pW[y_1, y_2] =
-I_\mu(\sqrt{2sx_H})$; hence the Green’s function is

$$G(x, x_0; s) = \left(\frac{x}{x_0}\right)^{\mu/2}
\begin{cases}
\dfrac{I_\mu(\bar{x})\left[I_\mu(\bar{x}_H)K_\mu(\bar{x}_0) - K_\mu(\bar{x}_H)I_\mu(\bar{x}_0)\right]}{I_\mu(\bar{x}_H)}, & x \le x_0 \\[3ex]
\dfrac{I_\mu(\bar{x}_0)\left[I_\mu(\bar{x}_H)K_\mu(\bar{x}) - K_\mu(\bar{x}_H)I_\mu(\bar{x})\right]}{I_\mu(\bar{x}_H)}, & x_0 \le x,
\end{cases} \tag{3.234}$$

where we use shorthand notation $\bar{z} = \sqrt{2sz}$. We can split this into a difference of two
functions,


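$$G(x, x_0; s) = \left(\frac{x}{x_0}\right)^{\mu/2}
\begin{cases}
I_\mu(\bar{x})K_\mu(\bar{x}_0), & x < x_0 \\
I_\mu(\bar{x}_0)K_\mu(\bar{x}), & x_0 < x
\end{cases}
\; - \; g^H(x, x_0; s), \tag{3.235}$$

where the first part corresponds to the barrier-free Green’s function of equation (3.208) and
the second part is

$$g^H(x, x_0; s) = \left(\frac{x}{x_0}\right)^{\mu/2}\frac{K_\mu(\sqrt{2sx_H})}{I_\mu(\sqrt{2sx_H})}\, I_\mu(\sqrt{2sx})\,I_\mu(\sqrt{2sx_0}). \tag{3.236}$$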
The inverse Laplace transform is the difference of two Laplace inverses. The first part is
exactly $u(x, x_0; \tau)$ of equation (3.215) for $0 < x, x_0 < x_H$, while the second inverse Laplace
contour integral is computed using exactly the same methods as in the previous section, i.e.,
using the closed contour integral in Figure 3.11 since $g^H$ is analytic, except for the branch
point at $s = 0$ and at simple poles along the negative real $s$-axis. The simple poles of $g^H$
are $s = -|\epsilon_n|$, where $\epsilon_n = \epsilon_n^{(\mu)}$, $n = 1, \ldots$, are now simply the zeros of the ordinary Bessel
function

$$J_\mu\!\left(\sqrt{2|\epsilon_n|x_H}\right) = 0. \tag{3.237}$$


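We note that the value $\epsilon_1$ is the first nonzero root.
Using the residue theorem, the Bromwich contour integral reduces to

$$\begin{aligned}
\mathcal{L}^{-1}\left[g^H(x, x_0; s)\right](\tau) &= \sum_{n=1}^{\infty} e^{-|\epsilon_n|\tau}\,\mathrm{Res}\,g^H(x, x_0; s = -|\epsilon_n|) \\
&\quad + \frac{1}{2\pi i}\int_0^\infty e^{-r\tau}\left[g^H(x, x_0; re^{-i\pi}) - g^H(x, x_0; re^{i\pi})\right]dr.
\end{aligned} \tag{3.238}$$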


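The branch cut discontinuity in $g^H$ is readily computed using the properties of the modified
Bessel functions for purely imaginary arguments, namely,

$$\begin{aligned}
g^H(x, x_0; re^{-i\pi}) &= \left(\frac{x}{x_0}\right)^{\mu/2}\frac{K_\mu(-i\sqrt{2rx_H})}{I_\mu(-i\sqrt{2rx_H})}\, I_\mu(-i\sqrt{2rx})\,I_\mu(-i\sqrt{2rx_0}) \\
&= \left(\frac{x}{x_0}\right)^{\mu/2}\frac{J_\mu(\sqrt{2rx_0})}{J_\mu(\sqrt{2rx_H})}\, I_\mu(-i\sqrt{2rx})\,K_\mu(-i\sqrt{2rx_H}),
\end{aligned} \tag{3.239}$$

giving

$$\frac{1}{2\pi i}\left[g^H(x, x_0; re^{-i\pi}) - g^H(x, x_0; re^{i\pi})\right] = \frac{1}{2}\left(\frac{x}{x_0}\right)^{\mu/2} J_\mu(\sqrt{2rx})\,J_\mu(\sqrt{2rx_0}),$$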


where the identity in equation (3.213) has been used. Inserting this expression into the
integrand shows that the integral term is exactly the barrier-free kernel $u(x, x_0; \tau)$, as in equation
(3.214). Taking the difference of Laplace inverses for the two terms in equation (3.235)
therefore cancels out the barrier-free portion and we are left with

$$u_H(x, x_0, x_H; \tau) = -\sum_{n=1}^{\infty} e^{-|\epsilon_n|\tau}\,\mathrm{Res}\,g^H(x, x_0; s = -|\epsilon_n|). \tag{3.240}$$


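Let $g^H = (x/x_0)^{\mu/2}\,\tilde{g}^H$; then the residues at the simple poles are given by

$$\mathrm{Res}\,\tilde{g}^H(x, x_0; s = -|\epsilon_n|) = -\frac{K_\mu(i\sqrt{2|\epsilon_n|x_H})\,I_\mu(i\sqrt{2|\epsilon_n|x})\,I_\mu(i\sqrt{2|\epsilon_n|x_0})}{\left.\dfrac{\partial I_\mu(i\sqrt{2\epsilon x_H})}{\partial \epsilon}\right|_{\epsilon=|\epsilon_n|}}. \tag{3.241}$$

Upon evaluating the derivative and using the relation $I_\mu(ix)K_\mu(iy) = -\frac{\pi}{2}J_\mu(x)Y_\mu(y)$, which holds
when $y$ is a zero of $J_\mu$, we have

$$\mathrm{Res}\,\tilde{g}^H(x, x_0; s = -|\epsilon_n|) = -\pi\sqrt{\frac{|\epsilon_n|}{2x_H}}\;\frac{Y_\mu(\sqrt{2|\epsilon_n|x_H})}{J_{\mu+1}(\sqrt{2|\epsilon_n|x_H})}\; J_\mu(\sqrt{2|\epsilon_n|x})\,J_\mu(\sqrt{2|\epsilon_n|x_0}). \tag{3.242}$$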
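This expression is simplified via the Wronskian property of the ordinary Bessel functions and
by making use of equation (3.237); i.e., using

$$Y_\mu\!\left(\sqrt{2|\epsilon_n|x_H}\right) = \frac{2}{\pi\sqrt{2|\epsilon_n|x_H}\;J_{\mu+1}\!\left(\sqrt{2|\epsilon_n|x_H}\right)} \tag{3.243}$$

we obtain

$$\mathrm{Res}\,g^H(x, x_0; s = -|\epsilon_n|) = -\left(\frac{x}{x_0}\right)^{\mu/2}\frac{J_\mu(\sqrt{2|\epsilon_n|x})\,J_\mu(\sqrt{2|\epsilon_n|x_0})}{x_H\,J_{\mu+1}^2(\sqrt{2|\epsilon_n|x_H})}. \tag{3.244}$$

Inserting this expression into equation (3.240) finally gives an exact closed-form eigenfunction
series solution for the kernel

$$u_H(x, x_0, x_H; \tau) = \left(\frac{x}{x_0}\right)^{\mu/2}\sum_{n=1}^{\infty} e^{-|\epsilon_n|\tau}\,\phi_n(x)\,\phi_n(x_0) \tag{3.245}$$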


in terms of the normalized ordinary Bessel eigenfunctions:

$$\phi_n(x) = \frac{J_\mu(\sqrt{2|\epsilon_n|x})}{\sqrt{x_H}\;J_{\mu+1}(\sqrt{2|\epsilon_n|x_H})}. \tag{3.246}$$

In closing this section we note that this result is also readily proven to obtain as the limit
$x_L \to 0$ in the double-barrier solution $u(x, x_0, x_L, x_H; \tau)$ of the previous section. We leave it
as an exercise for the interested reader.
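The single-upper-barrier series can be evaluated along the following lines. This is a minimal sketch under stated assumptions: the zeros $j_{\mu,n}$ of $J_\mu$ are located by a sign-change scan and bisection on a truncated power series (adequate only for the first few zeros), the eigenvalues are taken as $|\epsilon_n| = j_{\mu,n}^2/(2x_H)$ from equation (3.237), and the eigenfunctions are coded as $J_\mu(\sqrt{2|\epsilon_n|x})/\big(\sqrt{x_H}\,J_{\mu+1}(\sqrt{2|\epsilon_n|x_H})\big)$. All function names and parameter values are hypothetical.

```python
import math

def bessel_j(mu, z, terms=60):
    """J_mu(z), z >= 0, from its ascending series (adequate for moderate z only)."""
    return sum((-1) ** k * (0.5 * z) ** (2 * k + mu)
               / (math.factorial(k) * math.gamma(k + mu + 1.0))
               for k in range(terms))

def bessel_zeros(mu, count, step=0.1, tol=1e-12):
    """First `count` positive zeros of J_mu via sign-change scan and bisection."""
    zeros, lo = [], step
    f_lo = bessel_j(mu, lo)
    while len(zeros) < count:
        hi = lo + step
        f_hi = bessel_j(mu, hi)
        if f_lo * f_hi < 0.0:
            a, b, fa = lo, hi, f_lo
            while b - a > tol:
                mid = 0.5 * (a + b)
                f_mid = bessel_j(mu, mid)
                if fa * f_mid <= 0.0:
                    b = mid
                else:
                    a, fa = mid, f_mid
            zeros.append(0.5 * (a + b))
        lo, f_lo = hi, f_hi
    return zeros

def upper_barrier_kernel(x, x0, tau, x_high, mu, terms=6):
    """Eigenfunction series for u_H, truncated after `terms` modes."""
    total = 0.0
    for j in bessel_zeros(mu, terms):
        eps = j * j / (2.0 * x_high)                 # |eps_n| from J_mu(sqrt(2 eps x_H)) = 0
        norm = math.sqrt(x_high) * bessel_j(mu + 1.0, j)
        phi_x = bessel_j(mu, j * math.sqrt(x / x_high)) / norm
        phi_x0 = bessel_j(mu, j * math.sqrt(x0 / x_high)) / norm
        total += math.exp(-eps * tau) * phi_x * phi_x0
    return (x / x0) ** (0.5 * mu) * total

if __name__ == "__main__":
    # Hypothetical parameters: mu = 0, x_H = 2, tau = 0.5.
    print(bessel_zeros(0.0, 2))
    print(upper_barrier_kernel(1.0, 1.0, 0.5, 2.0, 0.0))
```

With only a handful of modes the truncation is reasonable when $\tau$ is not small; shorter times need more modes, and hence a more robust Bessel routine than the power series used here.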


The Case of a Single Lower Finite Barrier with Absorption 


This last case corresponds to zero-boundary conditions at a lower finite endpoint $x_L > 0$ and
at infinity with $0 < x_L < \infty$. The domain of the solution is the semi-infinite interval $[x_L, \infty)$.
We denote the kernel by $u_L(x, x_0, x_L; \tau)$. The imposed boundary conditions are now

$$u_L(x = x_L, x_0, x_L; \tau) = u_L(x = \infty, x_0, x_L; \tau) = 0. \tag{3.247}$$


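For the limiting value $x_L = 0$ the solution is simply that of the barrier-free (no absorption)
problem; for $x_L > 0$, $x_L$ is a single lower absorbing barrier. The Green’s function corresponding
to equation (3.185) obtains with choice $y_1(x; s) = \phi_\mu(x_L, x; s)$ and $y_2(x; s) = K_\mu(\sqrt{2sx})$ since
$K_\mu(\sqrt{2sx}) \to 0$ as $x \to \infty$. The Wronskian of these functions gives $pW[y_1, y_2] = K_\mu(\sqrt{2sx_L})$;
hence the Green’s function is

$$G(x, x_0; s) = \left(\frac{x}{x_0}\right)^{\mu/2}
\begin{cases}
\dfrac{\left[K_\mu(\bar{x}_L)I_\mu(\bar{x}) - I_\mu(\bar{x}_L)K_\mu(\bar{x})\right]K_\mu(\bar{x}_0)}{K_\mu(\bar{x}_L)}, & x \le x_0 \\[3ex]
\dfrac{K_\mu(\bar{x})\left[K_\mu(\bar{x}_L)I_\mu(\bar{x}_0) - I_\mu(\bar{x}_L)K_\mu(\bar{x}_0)\right]}{K_\mu(\bar{x}_L)}, & x_0 \le x.
\end{cases} \tag{3.248}$$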


Here again we use shorthand notation $\bar{z} = \sqrt{2sz}$. Rewriting, we have

$$G(x, x_0; s) = G^0(x, x_0; s) - g^L(x, x_0; s), \tag{3.249}$$

where $g^L(x, x_0; s) = (x/x_0)^{\mu/2}\,\tilde{g}^L(x, x_0; s)$,

$$\tilde{g}^L(x, x_0; s) = \frac{I_\mu(\sqrt{2sx_L})}{K_\mu(\sqrt{2sx_L})}\, K_\mu(\sqrt{2sx})\,K_\mu(\sqrt{2sx_0}), \tag{3.250}$$

and $G^0$ denotes the barrier-free Green’s function given by equation (3.208) for $x_L \le x, x_0 < \infty$.

Laplace-inversion of the first term, $G^0$, gives the barrier-free contribution $u(x, x_0; \tau)$ of
equation (3.215). Laplace-inversion of the $\tilde{g}^L$ term follows by using the same contour as in
the barrier-free case, i.e., Figure 3.10. We take the branch cut along the negative real axis, $s = |s|e^{i\theta}$,
$-\pi < \theta < \pi$, so the function $\tilde{g}^L$ is analytic except at the branch point $s = 0$ and the cut
along $\arg s = \pi$. From the properties of the modified Bessel functions of the second kind,
we know that $K_\mu(z)$ has no zeros in the region $|\arg z| \le \frac{\pi}{2}$ for real $\mu$ (see [AS64]). Hence
the denominator $K_\mu(\sqrt{2sx_L}) = K_\mu(\sqrt{2|s|x_L}\,e^{i\theta/2})$ is never zero at any point on or inside
the closed contour of Figure 3.10. We therefore deduce that $\tilde{g}^L(x, x_0; s)$ has no poles and is
analytic on and inside the contour. Using the residue theorem and following similar steps as
in the previous cases, the Bromwich contour integral reduces to give

$$\mathcal{L}^{-1}\left[\tilde{g}^L(x, x_0; s)\right](\tau) = \frac{1}{2\pi i}\int_0^\infty e^{-r\tau}\left[\tilde{g}^L(x, x_0; re^{-i\pi}) - \tilde{g}^L(x, x_0; re^{i\pi})\right]dr.$$


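The branch cut discontinuity in $\tilde{g}^L$ is readily computed by making use of the properties
$I_\mu(\pm ix) = e^{\pm i\mu\pi/2}J_\mu(x)$ and

$$e^{\pm i\mu\pi/2}K_\mu(\pm ix) = \mp\frac{i\pi}{2}\left[J_\mu(x) \mp iY_\mu(x)\right]$$

for real $\mu, x$. After some tedious algebraic manipulation, this gives the imaginary part

$$\begin{aligned}
\mathrm{Im}\,\tilde{g}^L(x, x_0; re^{-i\pi}) &= \frac{1}{2i}\left[\tilde{g}^L(x, x_0; re^{-i\pi}) - \tilde{g}^L(x, x_0; re^{i\pi})\right] \\
&= \frac{\pi}{2}\,\frac{J_\mu(\bar{x}_L)\left[\Phi_\mu^{(1)}(\bar{x}, \bar{x}_0)\,J_\mu(\bar{x}_L) + \Phi_\mu^{(2)}(\bar{x}, \bar{x}_0)\,Y_\mu(\bar{x}_L)\right]}{J_\mu^2(\bar{x}_L) + Y_\mu^2(\bar{x}_L)},
\end{aligned}$$

where we define new functions

$$\Phi_\mu^{(1)}(x, y) = J_\mu(x)J_\mu(y) - Y_\mu(x)Y_\mu(y), \tag{3.251}$$

$$\Phi_\mu^{(2)}(x, y) = J_\mu(x)Y_\mu(y) + Y_\mu(x)J_\mu(y) \tag{3.252}$$

with shorthand notation $\bar{z} = \sqrt{2rz}$.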
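An exact closed-form expression for the kernel is therefore

$$u_L(x, x_0, x_L; \tau) = u(x, x_0; \tau) - \tilde{u}(x, x_0, x_L; \tau) \tag{3.253}$$

where $u(x, x_0; \tau)$ is the barrier-free part as given by equation (3.215) and $\tilde{u}(x, x_0, x_L; \tau)$ has
the integral representation

$$\tilde{u}(x, x_0, x_L; \tau) = \frac{1}{2}\left(\frac{x}{x_0}\right)^{\mu/2}\int_0^\infty e^{-r\tau}\,\frac{J_\mu(\bar{x}_L)\left[\Phi_\mu^{(1)}(\bar{x}, \bar{x}_0)\,J_\mu(\bar{x}_L) + \Phi_\mu^{(2)}(\bar{x}, \bar{x}_0)\,Y_\mu(\bar{x}_L)\right]}{J_\mu^2(\bar{x}_L) + Y_\mu^2(\bar{x}_L)}\,dr$$

with $\bar{x}_L = \sqrt{2rx_L}$, $\bar{x} = \sqrt{2rx}$, $\bar{x}_0 = \sqrt{2rx_0}$. The zero-boundary condition at $x = x_L$ is readily verified.
In particular, setting $x = x_L$ while making use of the functions $\Phi_\mu^{(1)}(\bar{x}_L, \bar{x}_0)$ and $\Phi_\mu^{(2)}(\bar{x}_L, \bar{x}_0)$,
the integrand in our integral representation reduces to $e^{-r\tau}J_\mu(\bar{x}_L)J_\mu(\bar{x}_0)$. From equation (3.214)
we arrive at $\tilde{u}(x = x_L, x_0, x_L; \tau) = u(x_L, x_0; \tau)$; hence $u_L(x = x_L, x_0, x_L; \tau) = 0$.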
3.8 New Families of Analytical Pricing Formulas: “From x-Space to F-Space”


In this section we present a mathematical framework for generating various families of exact
analytical pricing kernels for nonlinear state-dependent diffusion processes. We shall refer
to this construction as the diffusion canonical transformation methodology. The method is
a reduction approach that essentially reduces the more complex state-dependent diffusion
problem (i.e., the so-called F-space problem that we wish to solve) into a simpler underlying
diffusion process (i.e., the x-space problem). One of the basic ideas of the approach is to
consider an x-space diffusion process that is analytically tractable, e.g., for which Green’s
function methods can be used to arrive at a solution. Pricing kernels for F-space then arise
as a result of having obtained transition kernels for an underlying x-space process. As seen
next, the technique makes use of a special combination of transformations.


3.8.1 Transformation Reduction Methodology


Throughout we shall consider time-homogeneous drift and volatility functions having no
explicit time dependence. Hence, without loss in generality we set initial time $t_0 = 0$, and
in particular for the x-space transition probability densities we simply write $u(x, x_0; \tau)$
[or $u(x, x_0; t)$] in place of $u(x, t, x_0, t_0)$, and $U(F, F_0; \tau)$ [or $U(F, F_0; t)$] denotes the F-space
transition density or pricing kernel. The basis of our reduction methodology arises from
Lemma 3.1 and ultimately Theorem 3.1 relating fundamental solutions of the Fokker–Planck
(or Kolmogorov) equation under two different stochastic processes.
Consider an underlying diffusion process with SDE

$$dx_t = \lambda(x_t)\,dt + \nu(x_t)\,dW_t, \tag{3.254}$$


where $W_t$ is a standard Wiener process. As already mentioned, the term $\nu(x)$ is the x-space
diffusion function or (generally state-dependent) volatility function, while $\lambda(x)$ is the drift
function. The x-space kernel $u = u(x, x_0; \tau)$ satisfies the corresponding forward and backward
Kolmogorov PDE (3.169) and (3.170). In F-space (e.g., forward-price space) we are interested
in finding pricing kernels for the corresponding SDE:

$$dF_t = \sigma(F_t)\,dW_t, \tag{3.255}$$

where $\sigma(F)$ is the F-space diffusion function or state-dependent volatility function, and
$W_t$ is a standard Wiener process under some new measure. The F-space kernel $U(F, F_0; t)$
satisfies a new time-homogeneous forward (and backward) Kolmogorov PDE for the process
described by equation (3.255). An important question that arises is: Can we develop new
families of solutions $U(F, F_0; t)$, corresponding to new volatility functions $\sigma(F)$, by making
use of (known) solutions $u(x, x_0; t)$? The answer is yes, and it is specifically contained in
what follows.


Lemma 3.1. Let $u = u(x, x_0; t)$ be a fundamental solution to the Fokker–Planck (forward
Kolmogorov) equation for the x-space stochastic process

$$\frac{\partial u}{\partial t} = \frac{1}{2}\frac{\partial^2}{\partial x^2}\left(\nu(x)^2 u\right) - \frac{\partial}{\partial x}\left(\lambda(x)\,u\right), \tag{3.256}$$

with Dirac delta function initial condition $\lim_{t \to 0} u(x, x_0; t) = \delta(x - x_0)$, with appropriate
boundary conditions at the endpoints of an interval that may be finite, semi-infinite, or
infinite. Let $x = X(F)$ be the invertible transformation with invertible mapping $F = F(x)$ and
having positive semidefinite derivative $dX(F)/dF = \nu(x)/\sigma(F)$ on the interval. Assume that
the function defined by

$$a(x, F) = \frac{1}{2}\lambda'(x) + \frac{\lambda(x)^2}{2\nu(x)^2} - \frac{\lambda(x)\,\nu'(x)}{\nu(x)}
 + \frac{1}{4}\left(\sigma(F)\sigma''(F) - \nu(x)\nu''(x)\right) + \frac{1}{8}\left[\nu'(x)^2 - \sigma'(F)^2\right] \tag{3.257}$$

is a constant $a(x, F) = a$, independent of $x$ with $F = F(x)$, hence also independent of $F$ with
$x = X(F)$. The related Fokker–Planck (forward Kolmogorov) equation in F-space,

$$\frac{\partial U}{\partial t} = \frac{1}{2}\frac{\partial^2}{\partial F^2}\left(\sigma(F)^2 U\right), \tag{3.258}$$

for the stochastic process defined by equation (3.255) then admits a fundamental solution
$U = U(F, F_0; t)$ of the form

$$U(F, F_0; t) = \frac{\nu(x)}{\sigma(F)}\exp\left[\frac{1}{2}\log\frac{\nu(x)/\sigma(F)}{\nu(x_0)/\sigma(F_0)} - \int_{x_0}^{x}\frac{\lambda(z)}{\nu(z)^2}\,dz + at\right]u(x, x_0; t), \tag{3.259}$$

where $x = X(F)$, $x_0 = X(F_0)$, with corresponding Dirac delta function initial condition
$\lim_{t \to 0} U(F, F_0; t) = \delta(F - F_0)$.


It is important to note that an equivalent result also obtains for the case that the mapping
$x = X(F)$ is assumed to be monotonically decreasing with $dX(F)/dF = -\nu(x)/\sigma(F)$, where
$\nu(x)$, $\sigma(F)$ are both positive semidefinite functions. Moreover, under fairly general boundary
conditions (such as homogeneous conditions) the kernels $u$ and $U$ are also solutions to
the corresponding backward time Kolmogorov equations; i.e., an equivalent result of the
foregoing is a statement involving the adjoint or backward time equations. Note also that
boundary conditions in the F-space kernel can be imposed by setting appropriate boundary
conditions in the x-space kernel via the mapping $x \to F$. In fact, by taking the simple Wiener
process as underlying x-space process, in Section 3.5 this procedure formed the basis for
deriving exact analytical pricing formulas for standard European as well as various barrier
options for the linear and quadratic volatility models. Under fairly general situations, unique
solutions for $U$ satisfying homogeneous boundary conditions are obtained by simply matching
(i.e., uniquely mapping) these homogeneous conditions in $u$. A direct proof of this lemma is
contained in Appendix A of this chapter.
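To make the constancy condition of Lemma 3.1 concrete, the sketch below evaluates a candidate form of $a(x, F)$ for two simple cases. The expression coded here, $a = \frac{1}{2}\lambda' + \frac{\lambda^2}{2\nu^2} - \frac{\lambda\nu'}{\nu} + \frac{1}{4}(\sigma\sigma'' - \nu\nu'') + \frac{1}{8}(\nu'^2 - \sigma'^2)$, is an assumption about the intended form of equation (3.257): it is the form for which setting $a = -\rho$ is consistent with the relation $\sigma = \sigma_0\,\nu\,e^{-2\int \lambda/\nu^2}/\hat{u}^2$ when $\hat{u}$ solves $\frac{1}{2}\nu^2\hat{u}'' + \lambda\hat{u}' - \rho\hat{u} = 0$. The two test models (a driftless Wiener process mapped to a quadratic volatility function, and a Wiener process with constant drift mapped to a linear volatility function) and all names are illustrative choices.

```python
def a_function(x, f, lam, dlam, nu, dnu, d2nu, sigma, dsigma, d2sigma):
    """Candidate a(x, F); an assumed form of equation (3.257), see lead-in."""
    return (0.5 * dlam(x) + lam(x) ** 2 / (2.0 * nu(x) ** 2) - lam(x) * dnu(x) / nu(x)
            + 0.25 * (sigma(f) * d2sigma(f) - nu(x) * d2nu(x))
            + 0.125 * (dnu(x) ** 2 - dsigma(f) ** 2))

zero = lambda _: 0.0
one = lambda _: 1.0

# Case 1: lambda = 0, nu = 1, u_hat = cosh(kappa x)  ->  quadratic sigma(F).
KAPPA, SIGMA0 = 0.5, 1.0
quad = lambda f: SIGMA0 * (1.0 - (KAPPA * f / SIGMA0) ** 2)
dquad = lambda f: -2.0 * KAPPA ** 2 * f / SIGMA0
d2quad = lambda f: -2.0 * KAPPA ** 2 / SIGMA0

def a_quadratic(f):
    return a_function(0.0, f, zero, zero, one, zero, zero, quad, dquad, d2quad)

# Case 2: lambda = LAM0 (constant), nu = 1, u_hat exponential  ->  linear sigma(F).
LAM0, RHO = 0.3, 0.2
SLOPE = 2.0 * (LAM0 ** 2 + 2.0 * RHO) ** 0.5   # sigma(F) = SLOPE * F
lin = lambda f: SLOPE * f
dlin = lambda f: SLOPE

def a_linear(f):
    return a_function(0.0, f, lambda _: LAM0, zero, one, zero, zero, lin, dlin, zero)

if __name__ == "__main__":
    print([a_quadratic(f) for f in (-1.5, 0.0, 1.2)])  # constant, equal to -KAPPA**2 / 2
    print([a_linear(f) for f in (0.5, 1.0, 3.0)])      # constant, equal to -RHO
```

In both cases the value returned does not depend on $F$, which is the property the lemma requires of the transformation.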

It is crucial to note that equation (3.257) implicitly defines a special class of invertible 
transformations that are used to generate our next main result. It is useful therefore to introduce 
a formal definition for such a variable transformation, which we shall refer to as a diffusion 
canonical transformation. One definition based on Lemma 3.1 is as follows. 


Definition 3.1. Let $\rho$ be an arbitrary constant, and let the (volatility) functions $\nu(x)$ and $\sigma(F)$
be positive semidefinite twice differentiable functions defined on appropriate finite, semi-infinite,
or infinite domains of x- and F-spaces, respectively. Furthermore, let the function
$a(x, F)$ be defined by equation (3.257), where $\lambda(x)$ is a differentiable (drift) function of $x$.
A diffusion canonical transformation is an invertible transformation $x = X(F)$ such that

$$a(x, F) = -\rho \quad \text{and} \quad \frac{dx}{dF} = \pm\frac{\nu(x)}{\sigma(F)}.$$
This definition now leads us to an equivalent, yet more directly useful and transparent,
definition, as follows. Note that since $\nu(x)/\sigma(F(x))$ is positive (or negative) semidefinite,
we can set $\nu(x)/\sigma(F(x)) = c\,(\psi(x))^2$, with arbitrary constant $c \ne 0$ and twice differentiable
function $\psi(x)$. Differentiating w.r.t. $x$, using $F'(x) = (dx/dF)^{-1} = \sigma/\nu$ (note: without loss in
generality the map is assumed either monotonically increasing or decreasing), and dividing
both sides by $c\,\psi(x)^2$ gives

$$\nu(x)\,\frac{\psi'(x)}{\psi(x)} = \frac{1}{2}\left(\nu'(x) - \sigma'(F)\right). \tag{3.260}$$

Squaring gives

$$\nu(x)^2\left(\frac{\psi'(x)}{\psi(x)}\right)^2 = \frac{1}{4}\left[\nu'(x)^2 - 2\sigma'(F)\nu'(x) + \sigma'(F)^2\right], \tag{3.261}$$

and multiplying the previous equation by $\nu'(x)$ gives

$$\nu(x)\,\nu'(x)\,\frac{\psi'(x)}{\psi(x)} = \frac{1}{2}\left[\nu'(x)^2 - \nu'(x)\sigma'(F)\right]. \tag{3.262}$$

Subtracting this last equation from the previous one gives

$$\frac{1}{4}\left[\nu'(x)^2 - \sigma'(F)^2\right] = -\nu(x)^2\left(\frac{\psi'(x)}{\psi(x)}\right)^2 + \nu(x)\,\nu'(x)\,\frac{\psi'(x)}{\psi(x)}. \tag{3.263}$$

Now differentiating $\psi'(x)/\psi(x)$ using equation (3.263) and multiplying by $\nu(x)^2$ while using
the previous expression, we have

$$\frac{1}{2}\left[\sigma(F)\sigma''(F) - \nu(x)\nu''(x)\right] = \nu(x)^2\left(\frac{\psi'(x)}{\psi(x)}\right)^2 - \nu(x)\,\nu'(x)\,\frac{\psi'(x)}{\psi(x)} - \nu(x)^2\,\frac{\psi''(x)}{\psi(x)}. \tag{3.264}$$

Note that the left-hand sides of equations (3.263) and (3.264) are contained in the expression for
$a(x, F)$; hence combining equations (3.263) and (3.264) into the expression for $a(x, F) = -\rho$
and simplifying gives

$$-\psi''(x, \rho) + V(x, \rho)\,\psi(x, \rho) = 0, \tag{3.265}$$

where

$$V(x, \rho) = \frac{1}{\nu(x)^2}\left[\lambda'(x) + \frac{\lambda(x)^2}{\nu(x)^2} - \frac{2\lambda(x)\,\nu'(x)}{\nu(x)} + 2\rho\right]. \tag{3.266}$$


Here we have denoted $\psi = \psi(x, \rho)$ to stress the explicit dependence on the constant parameter
$\rho$. Equation (3.265) is a homogeneous linear second-order ordinary differential equation.°

Based on the development directly preceding, we now present another equivalent, and 
more transparent and practical, definition for a diffusion canonical transformation. 


°The reader familiar with quantum mechanics will observe that equation (3.265) is essentially related to a
one-dimensional time-independent Schrödinger-like equation.
Definition 3.2. Let $\rho$ be an arbitrary constant and $\nu(x)$ and $\sigma(F)$ be positive semidefinite
twice differentiable (volatility) functions defined on some appropriate finite, semi-infinite,
or infinite domains of x- and F-spaces, respectively. Furthermore, let $\lambda(x)$ be a differentiable
(drift) function of $x$. A diffusion canonical transformation is an invertible variable
transformation $x = X(F)$ such that

$$\frac{dx}{dF} = \pm\frac{\nu(x)}{\sigma(F)} \tag{3.267}$$

and

$$\sigma(F) = \sigma_0\,\frac{\nu(x)}{[\psi(x, \rho)]^2} \tag{3.268}$$

with arbitrary constant $\sigma_0 \ne 0$ and $\psi(x, \rho)$ satisfying equation (3.265) with $V(x, \rho)$
given by equation (3.266). The inverse transformation $F = F(x)$ follows from $F'(x) =
\pm\sigma(F(x))/\nu(x) = \pm\sigma_0/[\psi(x, \rho)]^2$, and integrating gives

$$F(x) = \bar{F} \pm \sigma_0\int_{\bar{x}}^{x}\frac{dz}{[\psi(z, \rho)]^2} \tag{3.269}$$

with $\bar{F} = F(\bar{x})$ and $\bar{x}$ as an arbitrary constant. The $\pm$ factor allows for two possible branches
of either monotonically increasing or decreasing maps.
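The construction in Definition 3.2 can be illustrated numerically. The following sketch assumes the simplest x-space process, a driftless Wiener process with $\lambda = 0$ and $\nu = 1$, for which $V = 2\rho$ and $\psi(x) = \cosh(\kappa x)$ with $\kappa = \sqrt{2\rho}$ solves $\psi'' = V\psi$. It then builds $F(x)$ from equation (3.269) by Simpson quadrature (the $+$ branch, with $\bar{x} = 0$) and $\sigma$ from equation (3.268). The parameter values, the function names, and the choice of $\psi$ are assumptions for the example; in this case the map has the closed form $F = \bar{F} + \sigma_0\tanh(\kappa x)/\kappa$ and the resulting volatility function is quadratic in $F$.

```python
import math

KAPPA, SIGMA0, F_BAR = 0.5, 1.0, 0.0   # hypothetical parameters; rho = KAPPA**2 / 2

def psi(x):
    """A solution of psi'' = 2*rho*psi for lambda = 0, nu = 1 (illustrative choice)."""
    return math.cosh(KAPPA * x)

def f_of_x(x, n=200):
    """Equation (3.269), + branch with x_bar = 0, via composite Simpson's rule (n even)."""
    if x == 0.0:
        return F_BAR
    h = x / n
    total = 1.0 / psi(0.0) ** 2 + 1.0 / psi(x) ** 2
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) / psi(i * h) ** 2
    return F_BAR + SIGMA0 * h * total / 3.0

def sigma_of_x(x):
    """Equation (3.268) with nu(x) = 1: volatility expressed through x = X(F)."""
    return SIGMA0 / psi(x) ** 2

if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.7, 1.5):
        f = f_of_x(x)
        quadratic = SIGMA0 * (1.0 - (KAPPA * (f - F_BAR) / SIGMA0) ** 2)
        print(x, f, sigma_of_x(x), quadratic)
```

Choosing a different solution $\psi$ of equation (3.265), or a different x-space process, changes the family of volatility functions $\sigma(F)$ that is generated.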














In the analysis that follows throughout the rest of this section it is convenient to work
with a slightly modified version of $\psi$, by defining

$$\hat{u}(x, \rho) = \psi(x, \rho)\exp\left(-\int\frac{\lambda(x)}{\nu(x)^2}\,dx\right). \tag{3.270}$$

The integral here is left as indefinite since any choice of definite integration would simply
lead to an overall multiplicative factor. From equation (3.268) we therefore conclude that
a diffusion canonical transformation is one that relates the two volatility functions via the
(generally implicit) relationship

$$\sigma(F) = \frac{\sigma_0\,\nu(x)\exp\left(-2\int\frac{\lambda(x)}{\nu(x)^2}\,dx\right)}{[\hat{u}(x, \rho)]^2} \tag{3.271}$$

with $x = X(F)$ and where $\hat{u} = \hat{u}(x, \rho)$ is readily shown to satisfy

$$\frac{1}{2}\nu(x)^2\,\frac{d^2\hat{u}}{dx^2} + \lambda(x)\,\frac{d\hat{u}}{dx} - \rho\,\hat{u} = 0. \tag{3.272}$$

Indeed equation (3.272) follows by direct differentiation and substitution of equation (3.270)
into equation (3.265). As we will see, equation (3.271) is rather central to the whole transformation
methodology. Equation (3.272) actually turns out to be the homogeneous adjoint
equation for the corresponding x-space time independent Green’s function discussed in
Section 3.6, i.e., the homogeneous version of equation (3.173), with $\delta(x - x_0)$ replaced by zero
and Laplace transform variable $s = \rho$. A set of two linearly independent solutions for $\hat{u}$ follows
immediately from the Green’s function, as shown in Section 3.6. Using equations (3.269)
and (3.270), the mapping $F = F(x)$ can now also be rewritten explicitly in terms of $\hat{u}$:

$$F(x) = \bar{F} \pm \sigma_0\int_{\bar{x}}^{x}\frac{\exp\left(-2\int^{z}\frac{\lambda(y)}{\nu(y)^2}\,dy\right)}{[\hat{u}(z, \rho)]^2}\,dz \tag{3.273}$$

with inverse $x = X(F)$ given (generally implicitly) by inverting this relation using either
branch ($+$ sign branch for monotonically increasing or $-$ sign branch for monotonically
decreasing).

Using these equations we now summarize the main result into the following important 
main theorem, which follows as a direct consequence of the preceding lemma. 


Theorem 3.1. (Reduction-Mapping for Pricing Kernels) Given an x-space process satisfying
equation (3.254), with transition probability function $u(x, x_0; t)$ as fundamental solution
to the corresponding Kolmogorov (forward or backward) equation, and an F-space process
described by equation (3.255), with transition probability function $U(F, F_0; t)$ as fundamental
solution to the corresponding (forward or backward) Kolmogorov equation, the fundamental
solutions are related as follows:

$$U(F, F_0; t) = \frac{\nu(x)}{\sigma(F)}\,\frac{\hat{u}(x, \rho)}{\hat{u}(x_0, \rho)}\,e^{-\rho t}\,u(x, x_0; t), \tag{3.274}$$

where $x = X(F)$, $x_0 = X(F_0)$ are (implicitly) given by the diffusion canonical invertible
variable transformation defined by equation (3.271), or (3.273), and $\hat{u}(x, \rho)$ solves equation
(3.272), with $X'(F) = \pm\nu(X(F))/\sigma(F)$.





Proof. One way to verify this is to show that $U$ in equation (3.274) solves equation (3.258)
by changing derivatives w.r.t. $F$ to derivatives w.r.t. $x$ with repeated use of the chain rule
and using the fact that $u$ satisfies equation (3.256). Although straightforward, this process is
tedious. A simpler proof follows directly from the foregoing lemma. Indeed letting $a(x, F) =
-\rho$ in equation (3.257) gives the map $x = X(F)$ defined by equations (3.271) and (3.272), as
shown earlier. Hence $a = -\rho$ in equation (3.259). Moreover, using equation (3.271) we have

$$\exp\left[\frac{1}{2}\log\frac{\nu(x)/\sigma(F)}{\nu(x_0)/\sigma(F_0)}\right]
 = \left(\frac{\nu(x)}{\sigma(F)}\right)^{1/2}\left(\frac{\sigma(F_0)}{\nu(x_0)}\right)^{1/2}
 = \frac{\hat{u}(x, \rho)}{\hat{u}(x_0, \rho)}\exp\left(\int_{x_0}^{x}\frac{\lambda(z)}{\nu(z)^2}\,dz\right). \tag{3.275}$$

Substituting directly into equation (3.259) eliminates the exponential term, giving equation
(3.274), where we assume $\hat{u}(x, \rho)$ is either positive or negative semidefinite. Note also
that generally, and without loss in generality, the ratio of the volatility functions $\nu(x)/\sigma(F)$
is assumed to be positive definite; i.e., both volatility functions can be positive or negative
semidefinite. Otherwise, one simply takes the absolute value of the Jacobian of the
transformation. □
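Theorem 3.1 can be checked numerically in the simplest setting. The sketch below assumes a driftless x-space Wiener process ($\lambda = 0$, $\nu = 1$) with Gaussian kernel $u$, and the particular solution $\hat{u}(x) = e^{-\kappa x}$, $\kappa = \sqrt{2\rho}$, of $\frac{1}{2}\hat{u}'' = \rho\hat{u}$. With the lower integration limit pushed to $-\infty$ in the map $F(x)$, this gives $F = \sigma_0 e^{2\kappa x}/(2\kappa)$ and the linear volatility function $\sigma(F) = 2\kappa F$, so the pricing kernel assembled as Jacobian $\times$ $\hat{u}$-ratio $\times$ $e^{-\rho t}$ $\times$ $u$ should coincide with the driftless lognormal density of volatility $2\kappa$. The parameter values and function names are illustrative assumptions.

```python
import math

RHO = 0.08                      # hypothetical value of the constant rho
SIGMA0 = 1.0                    # arbitrary scale constant sigma_0
KAPPA = math.sqrt(2.0 * RHO)    # exponent of u_hat for lambda = 0, nu = 1

def u_hat(x):
    """Particular solution of (1/2) u_hat'' - rho u_hat = 0."""
    return math.exp(-KAPPA * x)

def x_of_f(f):
    """Inverse map X(F) for F = SIGMA0 * exp(2 kappa x) / (2 kappa)."""
    return math.log(2.0 * KAPPA * f / SIGMA0) / (2.0 * KAPPA)

def sigma(f):
    """F-space volatility sigma_0 * nu / u_hat^2 with nu = 1; equals 2*kappa*F."""
    return SIGMA0 / u_hat(x_of_f(f)) ** 2

def u_wiener(x, x0, t):
    """x-space transition density of a standard Wiener process."""
    return math.exp(-(x - x0) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def pricing_kernel(f, f0, t):
    """U(F, F0; t) assembled as in the reduction-mapping theorem, with nu = 1."""
    x, x0 = x_of_f(f), x_of_f(f0)
    return (1.0 / sigma(f)) * (u_hat(x) / u_hat(x0)) * math.exp(-RHO * t) * u_wiener(x, x0, t)

def lognormal_density(f, f0, t):
    """Driftless lognormal transition density with volatility 2*kappa."""
    vol = 2.0 * KAPPA
    d = math.log(f / f0) + 0.5 * vol * vol * t
    return math.exp(-d * d / (2.0 * vol * vol * t)) / (f * vol * math.sqrt(2.0 * math.pi * t))

if __name__ == "__main__":
    for f, f0, t in ((1.2, 1.0, 0.5), (0.7, 1.0, 2.0)):
        print(pricing_kernel(f, f0, t), lognormal_density(f, f0, t))
```

Other choices of $\hat{u}$, for example linear combinations of the two exponentials, lead to different volatility functions from the same x-space kernel.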


It is important to point out the basic structure of equation (3.274) and how this relates to
the asset pricing theory of Chapter 1. That is, the F-space transition density $U$ is related to
the x-space transition density by a combination of two terms. The first factor, $\nu(x)/\sigma(F)$, is
simply the Jacobian resulting from the assumed variable transformation $x \to F$. Within the
framework of stochastic differentials, equivalent martingale measures, and the continuous-time
asset-pricing theorem discussed in Chapter 1, the second term can now actually be
identified as a ratio of two numeraires $g_0/g_t$, where the numeraire at time $t$ is $g_t = e^{\rho t}/\hat{u}(x_t, \rho)$
and the x-space process at time $t$ denoted by $x_t$ has value $x$ at time $t$ and value $x_0$ at time
zero. Recall from Chapter 1 that a transition density corresponds to the current price of an
infinitely narrow butterfly spread pay-off (i.e. a delta function pay-off). Hence by assuming
$g_t$ as numeraire, the asset-pricing formula (1.292) allows us to also rewrite equation (3.274)
as the conditional expectation at time zero of the delta function pay-off:

$$U(F, F_0; t) = E_0^{Q(g)}\left[\frac{g_0}{g_t}\,\delta(F_t - F)\right]. \tag{3.276}$$

Note that process $F_t$ is generated from the underlying process $x_t$ via the mapping $F_t = F(x_t)$.
For an alternative and instructive “proof” of Theorem 3.1 as it relates to pricing measures,
see Appendix B of this chapter.

In summary, the foregoing reduction methodology provides exact analytical relationships among transition probability densities describing continuous diffusion under classes of different stochastic processes (i.e., x-space and F-space) with different state-dependent volatility and drift functions. Note that throughout we present the theory under the assumption of no explicit time dependence for all drift and volatility functions; furthermore, it is assumed that the drift function multiplying the $dt$ term in the SDE of the $F_t$ processes in F-space is zero. [It should be noted, however, that generally this does not necessarily imply that $F_t$ is a driftless (i.e., martingale) process in cases of nonlinear volatility functions $\sigma(F)$.] Extensions that further relax some of these assumptions are possible; however, these are not discussed here. As we will show, this result provides the main tool for generating a substantial number of new families of exactly solvable diffusions and hence for obtaining new pricing kernels under multiparameter volatility functions. The fact that $\sigma(F)$ involves multiple parameters can generally be seen from equation (3.271), wherein $\sigma_0$ and $\rho$ are two obvious parameters, while all other parameters can arise from the underlying x-space drift and volatility functions $\lambda(x)$ and $\nu(x)$, respectively. As is shown later, two other adjustable parameters arise if one considers arbitrary linear combinations of two linearly independent solutions to equation (3.272). That is, equation (3.272) admits a family of solutions; and since we are at liberty to choose any particular solution, every choice gives us a particular volatility function in F-space.

It is now apparent that if an $F_t$ process can be mapped onto an $x_t$ process (in the “diffusion canonical” sense), then solutions for F-space transition probability densities (i.e., pricing kernels) can be obtained by solving the x-space diffusion problem with appropriately imposed boundary conditions. Consequently, the functions $\hat{u}$ that solve equation (3.272) are the basic building blocks for ultimately deriving the pricing kernels $U(F, F_0; t)$ and hence for constructing solutions for the F-space processes. As described earlier, this arises simply from application of the theory of time-dependent and time-independent Green’s functions to the underlying x-space diffusion problem. For this reason, we also refer to such a function $\hat{u}$ as a generating function. By solving for $u(x, x_0; t)$ subject to a judicious choice of boundary conditions in x-space, one therefore generates the pricing kernel $U(F, F_0; t)$ via equation (3.274) while satisfying required boundary conditions in F-space via the inverse transformation $F = F(x)$. The analytical properties of $U$, such as nonnegativity, integrability, and probability conservation, depend upon the x-space drift and volatility functions and the choice of $\rho$.


3.8.2 Bessel Families of State-Dependent Volatility Models


Based on the exact analysis of a nontrivial underlying x-space process and the foregoing 
mapping reduction method, we are now ready to develop new families of analytically exact 
pricing kernels for multiparameter classes of diffusion models. In particular, we shall make 
use of the solutions to the Bessel process obtained in Section 3.7 and arrive at a new family 
of pricing kernels with corresponding volatility models that can be expressed in terms of the 
modified Bessel functions. We shall refer to these new models and solution kernels as the 
Bessel family of volatilities and pricing kernels. 
The results follow from a straightforward application of equation (3.274) of Theorem 3.1 starting from the exact form of the generating function $\hat{u}(x, \rho)$ in the case that the underlying x-space process has volatility function $\nu(x) = 2\sqrt{x}$ and drift $\lambda(x) = \lambda$, $x \in (0, \infty)$. From the discussion in Section 3.7, and in particular from equation (3.186), $\hat{u}(x, \rho)$ obtains from the general solution to the modified Bessel differential equation (3.204), for $s = \rho$. Explicitly, equation (3.186) with the two solutions $I_\mu(\sqrt{2\rho x})$ and $K_\mu(\sqrt{2\rho x})$, which are, respectively, strictly increasing and strictly decreasing nonnegative functions for $\rho > 0$, $\mu = \frac{\lambda}{2} - 1 > 0$, gives

$$\hat{u}(x, \rho) = x^{-\mu/2}\left[q_1 I_\mu(\sqrt{2\rho x}) + q_2 K_\mu(\sqrt{2\rho x})\right]. \tag{3.277}$$

Throughout we shall assume the family of solutions with $q_1$, $q_2$ as real constants and $\rho > 0$ such that $\hat{u}$ is nonnegative. In this case the map $x = X(F)$ [and its inverse $F = F(x)$] is strictly monotonic on the entire half-line $x \in [0, \infty)$. [Note: For $\rho < 0$, the general form for the generating function is expressible in terms of ordinary Bessel functions: $\hat{u}(x, \rho) = x^{-\mu/2}\left(q_1 J_\mu(\sqrt{-2\rho x}) + q_2 Y_\mu(\sqrt{-2\rho x})\right)$. In this case, however, invertible maps exist only on finite piecewise segments along the half-line $x > 0$ since the $J_\mu$, $Y_\mu$ functions are oscillatory and have multiple zeros.] Substituting $\hat{u}(x, \rho)$ from equation (3.277) into equation (3.273) and applying a change of integration variable gives





$$F(x) = \bar{F} + 2\sigma_0 \int_{z=\sqrt{2\rho \bar{x}}}^{z=\sqrt{2\rho x}} \frac{dz}{z\left[q_1 I_\mu(z) + q_2 K_\mu(z)\right]^2}, \tag{3.278}$$

with constant value $\bar{x}$ mapping into $F(\bar{x}) = \bar{F}$, an arbitrary real constant. Here we have used the $+$ branch of equation (3.273), while a similar result follows for the $-$ branch. This integral leads to two dual families of exact analytical expressions for the transformation $F = F(x)$. This follows directly with the use of the identity

$$\frac{d}{dz}\left(\frac{(1/q_2)\, I_\mu(z)}{q_1 I_\mu(z) + q_2 K_\mu(z)}\right) = \frac{1}{z\left[q_1 I_\mu(z) + q_2 K_\mu(z)\right]^2} \tag{3.279}$$

in the case of the first family, and with the use of

$$\frac{d}{dz}\left(\frac{-(1/q_1)\, K_\mu(z)}{q_1 I_\mu(z) + q_2 K_\mu(z)}\right) = \frac{1}{z\left[q_1 I_\mu(z) + q_2 K_\mu(z)\right]^2} \tag{3.280}$$

in the case of the second family. These general identities follow from the Wronskian relation $I_\mu(z) K_\mu'(z) - K_\mu(z) I_\mu'(z) = -1/z$. Using equation (3.279) gives





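$$F(x) = c_1 + \frac{2\sigma_0/(q_1 q_2)}{1 + (q_2/q_1)\, K_\mu(\sqrt{2\rho x})/I_\mu(\sqrt{2\rho x})}, \tag{3.281}$$

with $q_2 \neq 0$, while use of equation (3.280) gives

$$F(x) = c_2 - \frac{2\sigma_0/(q_1 q_2)}{1 + (q_1/q_2)\, I_\mu(\sqrt{2\rho x})/K_\mu(\sqrt{2\rho x})}, \tag{3.282}$$

with $q_1 \neq 0$, for the first and second families, respectively. Here the constants $c_1$, $c_2$ are given by $c_1 = \bar{F} - (2\sigma_0/q_1 q_2)/[1 + (q_2/q_1) K_\mu(\sqrt{2\rho \bar{x}})/I_\mu(\sqrt{2\rho \bar{x}})]$ and $c_2 = \bar{F} + (2\sigma_0/q_1 q_2)/[1 + (q_1/q_2) I_\mu(\sqrt{2\rho \bar{x}})/K_\mu(\sqrt{2\rho \bar{x}})]$. These constants are fixed by setting $\bar{x}$. For the first family, for example, by setting $\bar{x} = 0$ we have $F(0) = \bar{F}$ (i.e., $x = 0$ maps onto $F = \bar{F}$). In the limit $\bar{x} \to 0$, $K_\mu(\sqrt{2\rho \bar{x}})/I_\mu(\sqrt{2\rho \bar{x}}) \to \infty$, hence $c_1 = \bar{F}$. For the second family, we can choose $\bar{x} = \infty$, giving $c_2 = \bar{F}$, where $x = \infty$ maps onto $\bar{F}$. For the first family, we then have

$$F(x) = \bar{F} + \frac{2\sigma_0/(q_1 q_2)}{1 + (q_2/q_1)\, K_\mu(\sqrt{2\rho x})/I_\mu(\sqrt{2\rho x})}, \tag{3.283}$$

with $q_2 \neq 0$. By considering the asymptotic limits $I_\mu(\sqrt{2\rho x})/K_\mu(\sqrt{2\rho x}) \to 0$ as $x \to 0$ and $I_\mu(\sqrt{2\rho x})/K_\mu(\sqrt{2\rho x}) \to \infty$ as $x \to \infty$, we observe that the interval $x \in (0, \infty)$ maps one to one onto $F \in (\bar{F}, \bar{F} + 2\sigma_0/q_1 q_2)$ in this first family. Letting $\bar{x} \to \infty$ in the second family gives an alternate map as

$$F(x) = \bar{F} - \frac{2\sigma_0/(q_1 q_2)}{1 + (q_1/q_2)\, I_\mu(\sqrt{2\rho x})/K_\mu(\sqrt{2\rho x})}, \tag{3.284}$$

where $x \in (0, \infty)$ now maps one to one onto $F \in (\bar{F} - 2\sigma_0/q_1 q_2, \bar{F})$.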
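The identity (3.279) and the range of the first-family map (3.283) can be checked numerically. The following Python sketch is illustrative only. It uses the standard library, evaluates $I_\mu$ from its ascending power series and $K_\mu$ from the reflection formula $K_\mu = \frac{\pi}{2}(I_{-\mu} - I_\mu)/\sin(\mu\pi)$ (valid for non-integer $\mu$ and adequate only for moderate arguments), and the values chosen for $q_1$, $q_2$, $\sigma_0$, $\rho$, $\mu$, and $\bar{F}$ are assumptions made for the demonstration, not values taken from the text.

```python
import math

def bessel_i(nu, z, terms=80):
    """I_nu(z) from its ascending power series (requires z > 0 when nu < 0)."""
    half = 0.5 * z
    term = half ** nu / math.gamma(nu + 1.0)
    total = term
    for k in range(1, terms):
        term *= half * half / (k * (k + nu))
        total += term
    return total

def bessel_k(nu, z):
    """K_nu(z) for non-integer nu via the reflection formula."""
    return 0.5 * math.pi * (bessel_i(-nu, z) - bessel_i(nu, z)) / math.sin(nu * math.pi)

# Illustrative (assumed) parameter values.
MU, RHO = 1.5, 0.01
Q1, Q2, SIGMA0, F_BAR = 1.0, 2.0, 0.5, 0.0

def antiderivative(z):
    """(1/q2) I_mu / (q1 I_mu + q2 K_mu), the bracket differentiated in (3.279)."""
    i, k = bessel_i(MU, z), bessel_k(MU, z)
    return i / (Q2 * (Q1 * i + Q2 * k))

def integrand(z):
    """1 / (z [q1 I_mu + q2 K_mu]^2), the integrand of (3.278)."""
    d = Q1 * bessel_i(MU, z) + Q2 * bessel_k(MU, z)
    return 1.0 / (z * d * d)

def first_family_map(x):
    """F(x) of (3.283), written as F_bar + 2 sigma0 * antiderivative."""
    return F_BAR + 2.0 * SIGMA0 * antiderivative(math.sqrt(2.0 * RHO * x))

def central_difference(func, z, h=1e-5):
    """Second-order finite-difference estimate of func'(z)."""
    return (func(z + h) - func(z - h)) / (2.0 * h)

if __name__ == "__main__":
    z = 1.3
    print("d/dz of bracket:", central_difference(antiderivative, z))
    print("right-hand side:", integrand(z))
    upper = F_BAR + 2.0 * SIGMA0 / (Q1 * Q2)
    for x in (0.01, 1.0, 10.0, 100.0, 400.0):
        print(x, first_family_map(x), "upper endpoint:", upper)
```

The two printed derivative values should agree to within the finite-difference error, and the mapped values should increase with $x$ while staying inside $(\bar{F}, \bar{F} + 2\sigma_0/q_1 q_2)$.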
Applying the foregoing theorem, the volatility for the $F_t$ process is hence given by equation (3.271), which upon inserting the generating function in equation (3.277) gives

$$\sigma(F) = \frac{2\sigma_0}{\sqrt{X(F)}\left[q_1 I_\mu(\sqrt{2\rho X(F)}) + q_2 K_\mu(\sqrt{2\rho X(F)})\right]^2}. \tag{3.285}$$

Solving for $x = X(F)$ using either equation (3.283) or equation (3.284) and inserting into equation (3.285), we observe that the F-space volatility function generally involves as many as six adjustable parameters: $\sigma_0$, $\mu$, $\rho$, $q_1$, $q_2$, and $\bar{F}$. It can be seen from the transformations, however, that the effective number of independent parameters reduces to five: $\sigma_0/(q_1 q_2)$, $q_2/q_1$, $\mu$, $\rho$, $\bar{F}$.

Further properties of these variable transformations lead to other useful subfamilies of volatility models, as follows. As can be seen directly from equation (3.278), the function $F = F(x)$ is monotonically increasing, assuming $\sigma_0 > 0$. By considering the limit $q_1 \to 0$ (for fixed nonzero $q_2$), the first family reduces to a four-parameter subfamily,

$$F(x) = \bar{F} + a\,\frac{I_\mu(\sqrt{2\rho x})}{K_\mu(\sqrt{2\rho x})}, \tag{3.286}$$

where $x \in (0, \infty)$ maps onto $F \in (\bar{F}, \infty)$ (for constant $a = 2\sigma_0/q_2^2 > 0$), with volatility function

$$\sigma(F) = \frac{a}{\sqrt{X(F)}\, K_\mu^2(\sqrt{2\rho X(F)})}. \tag{3.287}$$

Similarly, by considering the limit $q_2 \to 0$ (for fixed nonzero $q_1$), the second family, with $F(x)$ defined by equation (3.284), admits another (dual) four-parameter subfamily of solutions with

$$F(x) = \bar{F} - a\,\frac{K_\mu(\sqrt{2\rho x})}{I_\mu(\sqrt{2\rho x})}, \tag{3.288}$$

where $x \in (0, \infty)$ maps onto $F \in (-\infty, \bar{F})$ (for $a = 2\sigma_0/q_1^2 > 0$), with volatility function

$$\sigma(F) = \frac{a}{\sqrt{X(F)}\, I_\mu^2(\sqrt{2\rho X(F)})}. \tag{3.289}$$


For option-pricing purposes, it is useful to consider the related family [obtained via the $-$ branch of equation (3.278) and using equation (3.280) for $q_2 = 0$] that maps $x \in (0, \infty)$ onto $F \in [\bar{F}, \infty)$ (for $a = 2\sigma_0/q_1^2 > 0$):

$$F(x) = \bar{F} + a\,\frac{K_\mu(\sqrt{2\rho x})}{I_\mu(\sqrt{2\rho x})}. \tag{3.290}$$

This family has the same volatility function (3.289) and defines a strictly monotonically decreasing function $F = F(x)$. Indeed, by differentiating equation (3.290) with respect to $x$ and using the Wronskian property, we find the derivative

$$\frac{dF(x)}{dx} = -\frac{a/2}{x\, I_\mu^2(\sqrt{2\rho x})}. \tag{3.291}$$

By combining the generating function (3.277), the F-space volatility function (3.285), and the x-space volatility function $\nu(x) = 2\sqrt{x}$ into our main equation (3.274) of Theorem 3.1, we obtain the relationship between a pricing kernel $U$ for the general (dual) six-parameter Bessel family and a kernel $u$ for the Bessel process:

$$U(F, F_0; t) = \frac{x^{1-\mu/2}\left[q_1 I_\mu(\sqrt{2\rho x}) + q_2 K_\mu(\sqrt{2\rho x})\right]^3}{\sigma_0\, x_0^{-\mu/2}\left[q_1 I_\mu(\sqrt{2\rho x_0}) + q_2 K_\mu(\sqrt{2\rho x_0})\right]}\; e^{-\rho t}\, u(x, x_0; t), \tag{3.292}$$

where $x = X(F)$ and $x_0 = X(F_0)$ are given by inverting either equation (3.283) for the first family or equation (3.284) for the second family of solutions. Here $u(x, x_0; t)$ is an x-space kernel for the Bessel process, as given in Section 3.7. The particular solution used for $u$ depends on what set of boundary conditions we require $U$ to satisfy. For instance, one uses either the kernel in equation (3.215), (3.232), (3.245), or (3.253), depending on the specific boundary conditions one wishes to impose. We point out that among the general possible Bessel families of pricing kernels given by equation (3.292), only a subclass of solutions with $q_2 = 0$ can provide pricing kernels with no absorption in F-space. This important class of solutions is discussed in detail in the next section. For a technical discussion concerning the general question of determining whether or not a given kernel represents a transition density that conserves probability over a solution domain (i.e., whether or not absorption occurs), see Section 3.8.4. It turns out that for nonzero $q_2$ the kernel $U$ in equation (3.292) always gives rise to probability leakage or absorption at an F-space endpoint, even in the case where equation (3.215) is used for the x-space kernel $u$. Partly because of this property and the added flexibility of the parameter space, the full six-parameter Bessel model is a good candidate for modeling credit-rating migration and default risk and for pricing under a credit setting [ACCZ03].


3.8.3 The Four-Parameter Subfamily of Bessel Models


In this section, we specialize the general Bessel family of solutions and consider a subfamily 
of models containing up to four parameters. The pricing of standard European-style options 
is considered under this model. Moreover, we show that special cases of this four-parameter 
Bessel subfamily correspond to other known exact solutions in the literature, such as the CEV 
(constant-elasticity-of-variance), quadratic, and affine volatility models. 

In particular, let us consider the model mentioned in the previous section, with zero-drift function and state-dependent volatility function $\sigma(F)$ given by equation (3.289) and where the inverse map $x = X(F)$ is defined uniquely (and generally implicitly) by inverting equation (3.290). As previously seen, this family is obtained by a one-to-one monotonically decreasing map of the underlying Bessel process space $x \in (0, \infty)$ onto the (asset price) space $F \in (\bar{F}, \infty)$. Figure 3.12 illustrates this map for a particular choice of model parameters. Figure 3.13 gives an illustration of some of the typical local volatility plots obtained within this family of models. In this family, $a$ and $\rho$ are positive parameters, $\bar{F}$ is an arbitrary parameter that corresponds to a lower bound of the $F_t$ process, and $\mu = \frac{\lambda}{2} - 1 > 0$ since $\lambda > 2$ is chosen so as to guarantee probability conservation for the pricing kernel $U(F, F_0; t)$ in the case of unrestricted barrier-free motion, with process $F_t$ attaining any value in $(\bar{F}, \infty)$.

FIGURE 3.12 Plot of $F = F(x)$ using equation (3.290) for $a = 0.1$, $\rho = 0.01$, $\mu = 1.5$, $\bar{F} = 0$.

FIGURE 3.13 Local volatility plots of $\sigma(F)/F$ versus $F$, for four sets of choices of model parameters: $(\rho, a, \mu) = (0.001, 16, 0.25)$, $(0.001, 9, 0.5)$, $(0.001, 1.7, 1.25)$, $(0.01, 150, 1.25)$. These choices correspond to most rapidly increasing to least increasing with fixed local volatility at $F = 100$ and the choice $\bar{F} = 0$.
In this case, $q_2 = 0$, and formula (3.292) reduces to

$$U(F, F_0; t) = \frac{2\, x^{1-\mu/2}\, I_\mu^3(\sqrt{2\rho x})}{a\, x_0^{-\mu/2}\, I_\mu(\sqrt{2\rho x_0})}\; e^{-\rho t}\, u(x, x_0; t), \tag{3.293}$$

with $x = X(F)$, $x_0 = X(F_0)$ via equation (3.290) and where $u(x, x_0; t)$ is an x-space kernel for the Bessel process, as given in the previous sections. This formula hence provides a general link between a pricing kernel for the underlying Bessel process and that for the four-parameter Bessel family. As such it can be used to generate exact analytical pricing kernels for the case of barriers (which are useful for pricing barrier options analytically under the four-parameter Bessel model), or we can simply use it to generate barrier-free pricing kernels.

In this section we focus on the case of barrier-free solutions. Specifically, by inserting equation (3.215) into equation (3.293) we obtain the barrier-free analytical pricing kernel for the four-parameter family in terms of the modified Bessel function of the first kind:

$$U(F, F_0; t) = \frac{e^{-\rho t - (X(F)+X(F_0))/2t}}{a t}\; \frac{X(F)\, I_\mu^3(\sqrt{2\rho X(F)})}{I_\mu(\sqrt{2\rho X(F_0)})}\; I_\mu\!\left(\frac{\sqrt{X(F)X(F_0)}}{t}\right). \tag{3.294}$$

Typical densities are shown in Figure 3.14. As can be observed for the particular choice of model parameters, the densities are significantly skewed, particularly for larger values of time $t$. This pronounced tail feature becomes apparent when comparing the cumulative densities of a four-parameter model with that of the lognormal model while choosing model parameters such that the two transition densities have similar spreads about the spot $F_0$ (i.e., the local volatility at $F_0$ is set to the lognormal volatility). Figure 3.15 gives a relative comparison of the cumulative densities. Note: Given a transition probability density $U(F, F_0; t)$, the cumulative density is defined in the usual manner by $\Phi(F, F_0; t) = \int_{\bar{F}}^{F} U(F', F_0; t)\, dF'$.

FIGURE 3.14 Plots of the transition probability density (3.294) for $a = 0.1$, $\rho = 0.01$, $\mu = 1.5$, $\bar{F} = 0$, $F_0 = 14.15$.

FIGURE 3.15 A relative comparison of cumulative density functions for that of a lognormal transition density (linear model) versus that for a typical four-parameter Bessel family kernel. The parameters are chosen so that the local volatility $\sigma(F_0)/F_0$, at spot $F_0 = 100$, equals the lognormal volatility parameter.





For option-pricing purposes it is useful to consider a change of variables $F \to x$ while using equation (3.291):

$$\left|\frac{dF}{dx}\right| U(F(x), F_0; t) = \frac{e^{-\rho t - (x+x_0)/2t}}{2t\, I_\mu(\sqrt{2\rho x_0})}\; I_\mu(\sqrt{2\rho x})\, I_\mu\!\left(\frac{\sqrt{x x_0}}{t}\right), \tag{3.295}$$

with $x_0 = X(F_0)$. As a function of $x$, this form is now simply a product of two Bessel functions times a decaying exponential factor. Integral identities for such functions are now useful. For instance, using property (3.258), it is easy to verify that the density given by equation (3.294) conserves probability over the allowable path space $F_t \in (\bar{F}, \infty)$:

$$\int_{\bar{F}}^{\infty} U(F, F_0; t)\, dF = \int_0^{\infty} \left|\frac{dF}{dx}\right| U(F(x), F_0; t)\, dx = \frac{e^{-\rho t - x_0/2t}}{2t\, I_\mu(\sqrt{2\rho x_0})} \int_0^{\infty} e^{-x/2t}\, I_\mu(\sqrt{2\rho x})\, I_\mu\!\left(\frac{\sqrt{x x_0}}{t}\right) dx = 1. \tag{3.296}$$
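The conservation statement (3.296) lends itself to a direct numerical check. The following sketch evaluates the x-space integral with Simpson's rule after the substitution $x = s^2$, which makes the integrand smooth at the origin. It is a minimal illustration under stated assumptions: the values $x_0 = 4$ and $t = 1$ are arbitrary, $\rho = 0.01$ and $\mu = 1.5$ follow the Figure 3.14 caption, the truncation point `s_max` is chosen by hand for these inputs, and the series-based Bessel helper is adequate only for moderate arguments.

```python
import math

def bessel_i(nu, z, terms=80):
    """I_nu(z), nu >= 0, from its ascending power series (moderate z only)."""
    half = 0.5 * z
    term = half ** nu / math.gamma(nu + 1.0)
    total = term
    for k in range(1, terms):
        term *= half * half / (k * (k + nu))
        total += term
    return total

def total_probability(x0, t, rho, mu, s_max=15.0, n=2000):
    """Right-hand side of (3.296), integrated over s = sqrt(x) on [0, s_max]."""
    def g(s):
        x = s * s
        return (math.exp(-x / (2.0 * t))
                * bessel_i(mu, math.sqrt(2.0 * rho * x))
                * bessel_i(mu, math.sqrt(x * x0) / t)
                * 2.0 * s)                      # Jacobian dx = 2 s ds
    h = s_max / n                               # n must be even for Simpson's rule
    total = g(0.0) + g(s_max)
    for j in range(1, n):
        total += (4.0 if j % 2 else 2.0) * g(j * h)
    integral = total * h / 3.0
    prefactor = (math.exp(-rho * t - x0 / (2.0 * t))
                 / (2.0 * t * bessel_i(mu, math.sqrt(2.0 * rho * x0))))
    return prefactor * integral

if __name__ == "__main__":
    print(total_probability(x0=4.0, t=1.0, rho=0.01, mu=1.5))
```

For other choices of $x_0$ and $t$ the truncation point and the number of series terms would need to be revisited, since the Bessel arguments grow like $\sqrt{x x_0}/t$.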


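A European-style option with assumed payoff $\Lambda(F)$, given a time to maturity $t$, can then be priced as an expectation integral (ignoring a discount factor throughout):

$$V(F_0, t) = \int_{\bar{F}}^{\infty} U(F, F_0; t)\, \Lambda(F)\, dF = \frac{e^{-\rho t - X(F_0)/2t}}{2t\, I_\mu(\sqrt{2\rho X(F_0)})} \int_0^{\infty} e^{-x/2t}\, I_\mu(\sqrt{2\rho x})\, I_\mu\!\left(\frac{\sqrt{x X(F_0)}}{t}\right) \Lambda(F(x))\, dx.$$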



Notice that expectation integrals are more readily computed by expressing the pay-off in terms of the $x$ variable. In this manner the implicit inversion step from $x$ to $F$ is mainly avoided. A European call written on the (forward) price $F_0$, maturing in time $t$, with strike $K > \bar{F}$ and payoff $\Lambda(F) = (F - K)^+$, can be priced exactly in terms of Bessel integrals:

$$C(F_0, K, t) = \frac{e^{-\rho t - X(F_0)/2t}}{2t\, I_\mu(\sqrt{2\rho X(F_0)})} \left[(\bar{F} - K)\, f^{(1)} + a\, f^{(2)}\right], \tag{3.297}$$

where

$$f^{(1)} = f^{(1)}(F_0, K, t) = \int_0^{X(K)} e^{-x/2t}\, I_\mu(\sqrt{2\rho x})\, I_\mu\!\left(\frac{\sqrt{x X(F_0)}}{t}\right) dx, \tag{3.298}$$

$$f^{(2)} = f^{(2)}(F_0, K, t) = \int_0^{X(K)} e^{-x/2t}\, K_\mu(\sqrt{2\rho x})\, I_\mu\!\left(\frac{\sqrt{x X(F_0)}}{t}\right) dx. \tag{3.299}$$

Equation (3.297) is derived by using equation (3.290) within the call pay-off of the expectation integral; because the map is decreasing, the call is in the money for $x < X(K)$. The corresponding put option price can be derived in similar fashion (see Problem 3). These integrals are efficiently computed by numerical routines. Figure 3.16 displays some exact numerical call prices by application of equation (3.297).
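As an illustration of how the integrals in equations (3.297) to (3.299) might be evaluated, the sketch below uses a midpoint rule in the variable $s = \sqrt{x}$ (which avoids evaluating $K_\mu$ at $x = 0$) and a bisection inversion of the map (3.290). It is only a sketch under stated assumptions: the parameters follow the Figure 3.14 caption ($a = 0.1$, $\rho = 0.01$, $\mu = 1.5$, $\bar{F} = 0$, $F_0 = 14.15$), the strikes and maturity are arbitrary, and the pure-Python series for the Bessel functions is adequate only for the moderate arguments that arise with these inputs. A production implementation would use a dedicated special-function library and an adaptive quadrature.

```python
import math

def bessel_i(nu, z, terms=120):
    """I_nu(z) from its ascending power series (requires z > 0 when nu < 0)."""
    half = 0.5 * z
    term = half ** nu / math.gamma(nu + 1.0)
    total = term
    for k in range(1, terms):
        term *= half * half / (k * (k + nu))
        total += term
    return total

def bessel_k(nu, z):
    """K_nu(z) for non-integer nu via the reflection formula."""
    return 0.5 * math.pi * (bessel_i(-nu, z) - bessel_i(nu, z)) / math.sin(nu * math.pi)

A, RHO, MU, F_BAR = 0.1, 0.01, 1.5, 0.0   # Figure 3.14 caption values

def price_map(x):
    """F(x) of equation (3.290)."""
    z = math.sqrt(2.0 * RHO * x)
    return F_BAR + A * bessel_k(MU, z) / bessel_i(MU, z)

def inverse_map(f, lo=1e-6, hi=1e3, iters=200):
    """Solve price_map(x) = f by bisection in log x (bracket is an assumption)."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if price_map(mid) > f:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def call_price(f0, strike, t, n=4000):
    """Undiscounted call price from (3.297), midpoint rule in s = sqrt(x)."""
    x0, xk = inverse_map(f0), inverse_map(strike)
    h = math.sqrt(xk) / n
    f1 = f2 = 0.0
    for j in range(n):
        s = (j + 0.5) * h
        x = s * s
        weight = (math.exp(-x / (2.0 * t))
                  * bessel_i(MU, s * math.sqrt(x0) / t) * 2.0 * s * h)
        z = math.sqrt(2.0 * RHO) * s
        f1 += weight * bessel_i(MU, z)      # integrand of (3.298)
        f2 += weight * bessel_k(MU, z)      # integrand of (3.299)
    prefactor = (math.exp(-RHO * t - x0 / (2.0 * t))
                 / (2.0 * t * bessel_i(MU, math.sqrt(2.0 * RHO * x0))))
    return prefactor * ((F_BAR - strike) * f1 + A * f2)

if __name__ == "__main__":
    for strike in (10.0, 15.0, 20.0):
        print(strike, call_price(14.15, strike, t=1.0))
```

The printed prices should be positive and decrease as the strike increases, which is a basic sanity check rather than a validation against the values plotted in Figure 3.16.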


3.8.3.1 Recovering the Constant-Elasticity-of-Variance Model 


FIGURE 3.16 European call prices as functions of spot $F_0$ for various maturities. The parameters $a = 5.06$, $\rho = 0.001$, $\lambda = 5$ ($\mu = 1.5$), $K = 100$ were chosen such that the local volatility at the strike is $\sigma(K)/K = 0.25$.

One way to recover the constant-elasticity-of-variance (CEV) model is to consider the limiting case where $\rho \to 0$ within the foregoing four-parameter Bessel family. For this purpose it is convenient to define a parameter $\theta > 0$ such that $\mu = (2\theta)^{-1}$, i.e., $\lambda = \theta^{-1} + 2$. Using the leading-order small-argument properties of the modified Bessel $I_\mu$ and $K_\mu$ functions with positive order $\mu$, we have the limiting form of the map, equation (3.290), as $\rho \to 0$:

$$F(x) \to \bar{F} + \frac{a}{2}\,\Gamma(\mu)\Gamma(\mu+1)\left(\frac{\rho x}{2}\right)^{-\mu} = \bar{F} + C x^{-\mu}, \tag{3.300}$$

where the constant is defined by $C = (a/2)(\rho/2)^{-\mu}\,\Gamma(\mu)\Gamma(\mu+1)$ and $\Gamma(\cdot)$ is the gamma function. Note: The limiting procedure we are considering is such that $\rho \to 0$ while $a\rho^{-\mu}$ is kept constant; i.e., we set the parameter $a = \text{const.} \times \rho^{\mu}$. Expressions are further simplified by defining a positive constant $\sigma_0$ by $C = \sigma_0^{-2\mu}$. Using $\mu = (2\theta)^{-1}$ within the last expression in equation (3.300) hence gives the limiting form of the map $x \to F$ in terms of $\sigma_0$:


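$$F(x) = \bar{F} + (\sigma_0^2 x)^{-1/(2\theta)}, \tag{3.301}$$

with inverse

$$x = X(F) = \sigma_0^{-2}(F - \bar{F})^{-2\theta}, \tag{3.302}$$

for any constant $\bar{F}$. Taking the same limit $\rho \to 0$ in equation (3.289) and using equation (3.302) gives

$$\sigma(F) \to \frac{a\,[\Gamma(\mu+1)]^2}{\sqrt{X(F)}\,\left(\rho X(F)/2\right)^{\mu}} = 2\sigma_0\, \frac{\Gamma(\mu+1)}{\Gamma(\mu)}\, (F - \bar{F})^{1+\theta}. \tag{3.303}$$

Now, using the gamma function property $\Gamma(z+1) = z\Gamma(z)$, we have $\Gamma(\mu+1)/\Gamma(\mu) = \mu = 1/(2\theta)$, and the volatility function for this model then reduces to the expression

$$\sigma(F) = \frac{\sigma_0}{\theta}\,(F - \bar{F})^{1+\theta}. \tag{3.304}$$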
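The limiting procedure can be illustrated numerically. In the sketch below, the Bessel map (3.290) is evaluated with $a$ tied to $\rho$ through $C = (a/2)(\rho/2)^{-\mu}\Gamma(\mu)\Gamma(\mu+1) = \sigma_0^{-2\mu}$ and compared with the CEV map (3.301) as $\rho$ decreases. The values $\theta = 1/3$, $\sigma_0 = 0.5$, $\bar{F} = 0$, and the evaluation point $x = 4$ are assumptions picked for the demonstration, and the series-based Bessel helpers are valid only for non-integer $\mu$ and moderate arguments.

```python
import math

def bessel_i(nu, z, terms=80):
    """I_nu(z) from its ascending power series (requires z > 0 when nu < 0)."""
    half = 0.5 * z
    term = half ** nu / math.gamma(nu + 1.0)
    total = term
    for k in range(1, terms):
        term *= half * half / (k * (k + nu))
        total += term
    return total

def bessel_k(nu, z):
    """K_nu(z) for non-integer nu via the reflection formula."""
    return 0.5 * math.pi * (bessel_i(-nu, z) - bessel_i(nu, z)) / math.sin(nu * math.pi)

# Illustrative (assumed) CEV parameters.
THETA, SIGMA0, F_BAR = 1.0 / 3.0, 0.5, 0.0
MU = 1.0 / (2.0 * THETA)                     # mu = 1/(2 theta) = 1.5

def bessel_map(x, rho):
    """Map (3.290) with a chosen so that a * rho**(-mu) stays fixed."""
    c = SIGMA0 ** (-2.0 * MU)
    a = 2.0 * c * (rho / 2.0) ** MU / (math.gamma(MU) * math.gamma(MU + 1.0))
    z = math.sqrt(2.0 * rho * x)
    return F_BAR + a * bessel_k(MU, z) / bessel_i(MU, z)

def cev_map(x):
    """Limiting map (3.301)."""
    return F_BAR + (SIGMA0 ** 2 * x) ** (-MU)

if __name__ == "__main__":
    x = 4.0
    for rho in (1e-2, 1e-4, 1e-6):
        rel = abs(bessel_map(x, rho) - cev_map(x)) / cev_map(x)
        print(rho, bessel_map(x, rho), cev_map(x), rel)
```

The relative difference should shrink roughly in proportion to $\rho$, consistent with the next-order terms in the small-argument expansions being of order $\rho x$.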
The exact barrier-free pricing kernel for the CEV volatility model (3.304) is then obtained by taking the same limit $\rho \to 0$ and using the small-argument leading order of the Bessel function $I_\mu$ in equation (3.294):

$$U(F, F_0; t) \to \frac{e^{-(X(F)+X(F_0))/2t}}{a t\,(\rho/2)^{-\mu}\,[\Gamma(\mu+1)]^2}\; (X(F))^{1+\frac{3\mu}{2}}\, (X(F_0))^{-\frac{\mu}{2}}\; I_\mu\!\left(\frac{\sqrt{X(F)X(F_0)}}{t}\right).$$

This expression is further reduced by making use of the map (3.302), substituting $\mu = (2\theta)^{-1}$, using the earlier definition $(a/2)(\rho/2)^{-\mu}\,\Gamma(\mu)\Gamma(\mu+1) = \sigma_0^{-2\mu}$ and the property $\Gamma(\mu+1)/\Gamma(\mu) = 1/(2\theta)$. Again we arrive at the barrier-free pricing kernel for the CEV model with volatility given by equation (3.304), and zero-drift function:

$$U(F, F_0; t) = \frac{\theta\,(F_0 - \bar{F})^{1/2}}{\sigma_0^2\, t\,(F - \bar{F})^{2\theta + 3/2}}\; \exp\!\left(-\frac{(F - \bar{F})^{-2\theta} + (F_0 - \bar{F})^{-2\theta}}{2\sigma_0^2 t}\right) I_{\frac{1}{2\theta}}\!\left(\frac{\left((F - \bar{F})(F_0 - \bar{F})\right)^{-\theta}}{\sigma_0^2 t}\right), \tag{3.305}$$

where $F, F_0 \in (\bar{F}, \infty)$. It is important to point out that this result can also be obtained independent of any consideration of the more general four-parameter Bessel family of solutions. In particular, this pricing kernel can be derived using equation (3.259) of Lemma 3.1, where the CEV process is directly mapped onto the underlying x-space Bessel process (see Problem 4). Solution (3.305) can also be extended to the case of a linear deterministic drift term (see Problem 5).

7 The CEV model is usually defined with volatility function $\sigma(F) = \delta(F - \bar{F})^{1+\theta}$. This simply corresponds to setting $\sigma_0 = \delta\theta$ in all our formulas.
Note that this result was derived in the case $\theta > 0$, for which the lower bound of the process, $F_t = \bar{F}$, is not attained. From equation (3.296) or by use of equation (3.357), the density is easily shown to integrate to unity (i.e., no absorption occurs and the density also vanishes at the endpoint $F \to \bar{F}$ and as $F \to \infty$). By replacing $\sigma_0/\theta \to \sigma_0/|\theta|$ in equation (3.304) and considering the kernel defined by equation (3.305) but with the slight modification $\theta/\sigma_0^2 \to |\theta|/\sigma_0^2$ in the prefactor, we obtain solutions for the CEV model for $\theta < 0$. Indeed one can verify that this modified pricing kernel is a solution. That is, by direct substitution the kernel is shown to satisfy the forward and backward Kolmogorov PDE. In the range $\theta < 0$, however, the properties of this pricing kernel are more subtle. In particular, one can show that the density integrates to unity for all values $\theta < -\frac{1}{2}$, hence no absorption occurs for $\theta \in (-\infty, -\frac{1}{2})$. The boundary conditions for the density can be shown to be vanishing at $F \to \bar{F}$ (i.e., paths do not attain the lower endpoint) for all $\theta < -1$. In contrast, for $\theta \in (-1, -\frac{1}{2})$ the density becomes singular at the lower endpoint, $F = \bar{F}$ (hence this corresponds to the case where the density has an integrable singularity for which paths can also attain the lower endpoint but are not absorbed). For the special case of $\theta = -\frac{1}{2}$, the formula gives rise to absorption. [Note that for the range $\theta \in (-\frac{1}{2}, 0)$ the assumed pricing kernel is not useful, since it gives rise to a density that has a nonintegrable singularity at $F = \bar{F}$, except for certain fractional values of $\theta$. For $\theta < 0$, however, another solution that is integrable is obtained by only replacing the order $(2\theta)^{-1}$ by $-(2\theta)^{-1}$ in the Bessel function. The latter solution for the density does not integrate to unity and hence gives rise to absorption, whereby the lower finite endpoint $\bar{F}$ is an exit boundary.] The special case of $\theta = -1$ gives a nonzero constant value at the lower endpoint and recovers the Wiener process with reflection at $F = \bar{F}$ and no absorption on the interval $[\bar{F}, \infty)$, with

$$U(F, F_0; t) = \frac{1}{\sigma_0\sqrt{2\pi t}}\left(e^{-(F - F_0)^2/2\sigma_0^2 t} + e^{-(F + F_0 - 2\bar{F})^2/2\sigma_0^2 t}\right). \tag{3.306}$$

In the limit $\bar{F} \to -\infty$ this gives back the kernel for the pure Wiener process on the entire real line $F \in (-\infty, \infty)$, with $U = e^{-(F - F_0)^2/2\sigma_0^2 t}/\sigma_0\sqrt{2\pi t}$.


3.8.3.2 Recovering Quadratic Models 


We have already seen, in Section 3.5.2, that the Wiener process constitutes a useful underlying x-space process for generating exact F-space pricing kernels for the quadratic volatility model of the form in equation (3.117) with two distinct roots. In fact, in Section 3.5.2 we employed the diffusion canonical reduction transformation methodology and thereby generated various exact pricing kernels for this quadratic volatility model by specifically mapping the process onto the constant-volatility Wiener process. It is now instructive to show that the quadratic model with one double root (i.e., one root of order 2) at the lower limit, $\bar{F}$, obtains as a special case of the four-parameter Bessel family. For this we simply consider the CEV model with choice $\theta = 1$. From equation (3.304) the volatility function is then

$$\sigma(F) = \sigma_0 (F - \bar{F})^2. \tag{3.307}$$

Using the Bessel function $I_{1/2}(z) = \sqrt{2/\pi z}\, \sinh z$, equation (3.305) gives the exact barrier-free pricing kernel for this model:

$$U(F, F_0; t) = \frac{2\,(F_0 - \bar{F})}{\sigma_0\sqrt{2\pi t}\,(F - \bar{F})^3}\; \exp\!\left(-\frac{(F - \bar{F})^{-2} + (F_0 - \bar{F})^{-2}}{2\sigma_0^2 t}\right) \sinh\!\left(\frac{(F - \bar{F})^{-1}(F_0 - \bar{F})^{-1}}{\sigma_0^2 t}\right), \tag{3.308}$$

where $F, F_0 \in (\bar{F}, \infty)$. This density integrates to unity exactly (see Problem 2).
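The normalization of the quadratic-model kernel (3.308) can be checked by direct quadrature. The sketch below rewrites $2e^{-A}\sinh B$ as a difference of two exponentials to avoid overflow in the hyperbolic sine, and integrates in the variable $y = 1/(F - \bar{F})$, where the integrand is essentially Gaussian. The parameter values ($F_0 = 100$, $\sigma_0 = 0.0025$, $t = 1$, $\bar{F} = 0$, giving a local volatility $\sigma(F_0)/F_0$ of 25%) and the truncation at twelve standard deviations are assumptions made for this illustration only.

```python
import math

def quadratic_kernel(f, f0, t, sigma0, f_bar=0.0):
    """Kernel (3.308), using 2 exp(-A) sinh(B) = exp(-(A-B)) - exp(-(A+B))."""
    y, y0 = 1.0 / (f - f_bar), 1.0 / (f0 - f_bar)
    var = sigma0 * sigma0 * t
    diff = (math.exp(-(y - y0) ** 2 / (2.0 * var))
            - math.exp(-(y + y0) ** 2 / (2.0 * var)))
    return (f0 - f_bar) * y ** 3 * diff / (sigma0 * math.sqrt(2.0 * math.pi * t))

def total_mass(f0, t, sigma0, f_bar=0.0, n=4000):
    """Integral of the kernel over F in (f_bar, inf), computed in y = 1/(F - f_bar)."""
    y0 = 1.0 / (f0 - f_bar)
    y_max = y0 + 12.0 * sigma0 * math.sqrt(t)   # truncation point (assumption)
    h = y_max / n                               # n must be even for Simpson's rule

    def g(y):
        if y == 0.0:
            return 0.0                          # integrand vanishes as F -> infinity
        return quadratic_kernel(f_bar + 1.0 / y, f0, t, sigma0, f_bar) / (y * y)

    total = g(0.0) + g(y_max)
    for j in range(1, n):
        total += (4.0 if j % 2 else 2.0) * g(j * h)
    return total * h / 3.0

if __name__ == "__main__":
    print(total_mass(f0=100.0, t=1.0, sigma0=0.0025))
```

The same change of variable is the natural starting point for an analytical proof, since in $y$ the integrand becomes $y$ times a difference of two Gaussians.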
FIGURE 3.17 A hierarchy of analytically solvable state-dependent models with examples of their corresponding typical local volatility curves. The popular linear (Black-Scholes) model gives only the flat-line local volatility shape.

It is interesting to observe that the foregoing exactly solvable state-dependent multiparameter volatility models form a kind of model hierarchy that can be summarized in a flowchart, as depicted in Figure 3.17. At the top are the underlying (x-space) processes that are used to generate the various pricing (F-space) models. Most of the models depicted are subsets of the Bessel family. However, extensions to other models are also possible by means of the techniques presented in this chapter. For example, one can enlarge the family of exact pricing kernels by considering the CIR process as an underlying x-space process. As seen in Chapter 2, the CIR process has the linear-drift function $\lambda(x) = \lambda_0 + \lambda_1 x$ and hence has one extra parameter as compared to the Bessel process. This gives rise to the family of confluent hypergeometric functions (e.g., Whittaker and Kummer functions), for which the Bessel functions form a special subset, as depicted in Figure 3.17. The so-called confluent hypergeometric family can be shown to contain a total of seven adjustable parameters. We refer the interested reader to some recent literature on this topic [ACCL01, Lip03]. Other extensions are also possible. The search for new families of analytical solutions to complex state-dependent models and their applications to pricing is a topic of current and ongoing research in financial mathematics. For recent works on pricing path-dependent options using new families of state-dependent volatility models see [CaM04a, CaM04b, CaM05].


Problems 


Problem 1. Show that the density in equation (3.305) integrates to unity for all $t > 0$ by a change of variables using equation (3.302) and an appropriate Bessel integral identity. Show that in the limit $t \to 0$ the density represents a Dirac delta function.


Problem 2. Show that the density in equation (3.308) integrates to unity for all t > 0. In doing 
so, do not employ any Bessel integral identity. Hint: Change variables to x = o}? (F — F)? 
and rewrite the integral so as to make use of the identity [> ye dy = Jt (£) ta 


Problem 3. Derive the European put option formula analogous to equation (3.297) for the 
four-parameter Bessel model. 


Problem 4. Let $\lambda(x) = \lambda = \theta^{-1} + 2$, $\nu(x) = 2\sqrt{x}$, $\sigma(F) = \frac{\sigma_0}{\theta}(F - \bar{F})^{1+\theta}$. Using relation (3.257)
with the choice p = 0 [i.e., a(x, F) = 0], show that the mapping in equation (3.302) obtains. 
By substituting the kernel u(X(F), X(Fo); t) of equation (3.215) into equation (3.259) of 
Lemma 3.1, arrive at the kernel in equation (3.305). 


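Problem 5. Show that the kernel defined by $U_\mu(F, F_0; t) = e^{-\mu t}\, U(e^{-\mu t}F, F_0; \tau(t))$, where $U$ solves the CEV process $dF_t = \delta F_t^{1+\theta} dW_t$ [i.e., as in equation (3.305) with $\bar{F} = 0$, $\sigma_0 = \delta\theta$], is a solution to the corresponding CEV process with an added drift function: $dF_t = \mu F_t\, dt + \delta F_t^{1+\theta} dW_t$, for arbitrary drift parameter $\mu$. In doing so, arrive at $\tau(t) = (e^{2\mu\theta t} - 1)/2\theta\mu$ and hence derive the barrier-free kernel

$$U_\mu(F, F_0; t) = \frac{F_0^{1/2}\, F^{-2\theta - 3/2}\, e^{(2\theta + 1/2)\mu t}}{\theta\delta^2 \tau(t)}\; \exp\!\left(-\frac{F^{-2\theta} e^{2\mu\theta t} + F_0^{-2\theta}}{2\theta^2\delta^2 \tau(t)}\right) I_{\frac{1}{2\theta}}\!\left(\frac{(F F_0)^{-\theta}\, e^{\mu\theta t}}{\theta^2\delta^2 \tau(t)}\right). \tag{3.309}$$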
3.8.4 Conditions for Absorption, or Probability Conservation


Consider an x-space kernel solving equation (3.256) and a given fixed interval $x \in (a, b)$. Given an initial interior point $x_0 \in (a, b)$ at time $t = 0$, the probability $p(a, b|x_0, t)$ that a sample path $x_\tau$, $0 \le \tau \le t$, will have terminal value $x_t = x \in (a, b)$ within the fixed interval at time $t > 0$ is then

$$p(a, b|x_0, t) = \int_a^b u(x, x_0; t)\, dx. \tag{3.310}$$

The rate of absorption into the interval, or the rate of probability increase, denoted by $r(a, b|x_0, t)$, is then given by $\partial p/\partial t$. Taking the time derivative inside the integral while making use of the forward equation (3.256) and integrating gives

$$r(a, b|x_0, t) = \left[\frac{1}{2}\frac{\partial}{\partial x}\left(\nu^2(x)\, u(x, x_0; t)\right) - \lambda(x)\, u(x, x_0; t)\right]_{x=a}^{x=b}. \tag{3.311}$$

If $r(a, b|x_0, t) = 0$ for all $t$, then no absorption occurs over time; otherwise absorption occurs inside (or outside) the interval $x \in (a, b)$.

Of interest is whether kernels with imposed homogeneous (zero) boundary conditions give rise to absorption or not. In this case we take $a = x_L$ and $b = x_H$ as, respectively, the lower and upper endpoints of the entire solution space and generally assume solutions such that $\lambda(x)u(x, x_0; t) \to 0$ at both endpoints.⁸ This is certainly the case for all the x-space kernels considered throughout this chapter, as can be verified. It hence follows from equation (3.311) that the kernel gives no absorption if

$$\lim_{x \to x_L^+} \frac{\partial}{\partial x}\left(\nu^2(x)\, u(x, x_0; t)\right) = 0 \tag{3.312}$$

and

$$\lim_{x \to x_H^-} \frac{\partial}{\partial x}\left(\nu^2(x)\, u(x, x_0; t)\right) = 0. \tag{3.313}$$

Moreover, note that (regardless of whether $u$ is a barrier-free kernel or a kernel with absorption at a barrier for $t > 0$) any kernel $u$ integrates to unity in the limit $t \to 0$ because of the imposed delta function initial condition: $u(x, x_0; t) \to \delta(x - x_0)$ as $t \to 0$. The no-absorption conditions (3.312) and (3.313), if satisfied, ensure that $p(x_L, x_H|x_0, t)$ is constant as a function of $t$ and therefore that conservation of probability is satisfied, with kernel $u$ integrating to unity for all $t \ge 0$.

8 Depending on the solution interval, a lower (upper) endpoint $x_L$ ($x_H$) takes on either a finite value or $-\infty$ ($\infty$).

For the Bessel process, $\nu(x) = 2\sqrt{x}$; hence equation (3.311) simplifies to give the absorption rate $r(x_L, x_H|x_0, t)$ proportional to

$$\lim_{x \to x_H^-} x\,\frac{\partial u(x, x_0; t)}{\partial x} - \lim_{x \to x_L^+} x\,\frac{\partial u(x, x_0; t)}{\partial x}. \tag{3.314}$$

Using the barrier-free kernel given by equation (3.215) [i.e., $x_L = 0$, $x_H = \infty$, $x \in (0, \infty)$] while making use of the asymptotic properties of the $I_\mu(z)$ function for argument $z \to 0$ and $z \to \infty$, it is readily shown that these limits are both zero, hence giving no absorption. This barrier-free kernel therefore conserves probability. Alternatively, this is readily shown by direct integration; see Section 3.7.1. In contrast, the kernels given by equations (3.232), (3.245), and (3.253) for the double- and single-barrier Bessel process are all readily proven to lead to absorption. Considering the double-barrier solution equation (3.232), for example, the absorption rate due to either endpoint involves terms of the form $x^{\frac{\mu}{2}+1}\phi_n'(x)$ and $x^{\frac{\mu}{2}+1}\phi_n(x)$, with $x \to x_L$ and $x \to x_H$. The eigenfunctions evaluated at the endpoints obviously give zero, by design: $\phi_n(x = x_L) = \phi_n(x = x_H) = 0$. However, the derivatives of the eigenfunctions, $\phi_n'(x = x_L)$ and $\phi_n'(x = x_H)$, are nonzero. Similar arguments can be used to show that the other Bessel barrier solutions also give rise to absorption; i.e., probability is not conserved as paths attaining either finite barrier level, $x_L > 0$ or $x_H > 0$, are absorbed.
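The vanishing of the two limits in (3.314) for the barrier-free kernel can be illustrated numerically. Equation (3.215) itself is not reproduced in this section, so the sketch below assumes the form $u(x, x_0; t) = \frac{1}{2t}(x/x_0)^{\mu/2} e^{-(x+x_0)/2t} I_\mu(\sqrt{x x_0}/t)$, which is what one obtains by comparing equations (3.293) and (3.294). The values $x_0 = 4$, $t = 1$, $\mu = 1.5$ and the evaluation points near the two endpoints are assumptions chosen for the demonstration; evaluating at a few points only suggests, and does not prove, the limiting behaviour.

```python
import math

def bessel_i(nu, z, terms=80):
    """I_nu(z), nu >= 0, from its ascending power series (moderate z only)."""
    half = 0.5 * z
    term = half ** nu / math.gamma(nu + 1.0)
    total = term
    for k in range(1, terms):
        term *= half * half / (k * (k + nu))
        total += term
    return total

def bessel_kernel(x, x0, t, mu):
    """Barrier-free Bessel-process kernel, form assumed for equation (3.215)."""
    return ((x / x0) ** (0.5 * mu) * math.exp(-(x + x0) / (2.0 * t))
            * bessel_i(mu, math.sqrt(x * x0) / t) / (2.0 * t))

def boundary_flux(x, x0, t, mu, rel_step=1e-4):
    """x * du/dx, the quantity whose endpoint limits appear in (3.314)."""
    h = rel_step * x
    du = bessel_kernel(x + h, x0, t, mu) - bessel_kernel(x - h, x0, t, mu)
    return x * du / (2.0 * h)

if __name__ == "__main__":
    x0, t, mu = 4.0, 1.0, 1.5
    for x in (1e-2, 1e-4, 1e-6):          # approaching the lower endpoint x = 0
        print("near 0:", x, boundary_flux(x, x0, t, mu))
    for x in (50.0, 100.0, 200.0):        # moving toward the upper endpoint
        print("large x:", x, boundary_flux(x, x0, t, mu))
```

Both sequences of printed values should decay toward zero, in line with the statement that the barrier-free kernel gives no absorption at either endpoint.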

Now consider any F-space kernel U that is generated from an underlying x-space kernel 
u as given by equation (3.274) and a fixed interval F € (F, F,), F, = F(a), F, = F(b), where 
x € (a,b) maps one to one onto F € (F,, F,). Given an initial point Fy € (F,, F,) at time 
t = 0, the probability P(F,, F,|Fo, t) that a sample path F,, 0 < 7 < t, will have terminal 
value F, = F € (F,, F,) at time t > 0 is then, in analogy with equation (3.310), 





Fp 
P(F,, F,|Fo. t) = Í U(F, Fo; dF. (3.315) 
The rate of absorption into the interval, denoted by R(F,, F,| Fo, £), is given by dP/dt. [Note: 
The absorption rate outside the interval is then just —R.] Again, taking the time derivative 
inside the integral and now using the forward equation (3.258) gives 


F=F, 


R(F,, F,|Fo, t) = se (anvur Fy; D) (3.316) 





F=F, 


a 


In carrying out further analysis, it is convenient simply to transform to x-space variables. In 
particular, using equation (3.274), and the chain rule, 


x=b 


, (3.317) 


x=a 





err v(x) ð A f 
$x_0 = X(F_0)$, $a = X(F_a)$, $b = X(F_b)$, where $|dX(F)/dF| = v(x)/\sigma(F(x))$ is used. By mapping $x \in [x_L, x_H]$ onto $F \in [L, H]$, with endpoints⁹ $F(x_L) = L$ and $F(x_H) = H$ ($x_L = X(L)$, $x_H = X(H)$), and letting $a = x_L$, $b = x_H$, then from equation (3.317) the no-absorption condition for the interval $F \in (L, H)$ can be written generally as

xX=Xy 


v(x) ô NE l 7 
È TOE CONE TOAN D) —0. (3.318) 





In analogy with the x-space kernel, this condition, if satisfied, therefore represents probability 
conservation with total unit probability on the entire interval of the F-space solution, since 
the kernel $U$ also integrates to unity in the limit $t \to 0$; i.e., $U(F, F_0; t) \to \delta(F - F_0)$ as $t \to 0$.

The general condition given by equation (3.318) can hence be used to determine whether 
absorption arises for any F-space kernel obtained via Theorem 3.1. We now apply this 
condition to the general Bessel family. In particular, using equations (3.277) and (3.285), the 
general Bessel family of pricing kernels given by equation (3.292) then admits a no-absorption 
condition in the form 


| (Sito) [Po Hat, V + K 6/209) Jul a D 


Ou(X, Xo; t) w 


=0. (331 
= 0. (3.319) 


> x(x, p) 
X=Xy 
From our analysis on the x-space kernels we readily observe that all single- and double-barrier solutions with $u(x_L, x_0; t) = u(x_H, x_0; t) = 0$ for finite $x_L, x_H > 0$ lead to absorption. This is the case since the $I_\mu$, $K_\mu$, $I_\mu'$, and $K_\mu'$ functions are finite at finite nonzero endpoints, hence for the barrier kernels the foregoing condition reduces to


X=xXH 


Ou(x, Xo; t) 


=0. (3.320) 
Ox 


xu(x; p) 
XES; 

However, as just seen, this condition cannot generally be satisfied for any of the barrier kernels given by equations (3.232), (3.245), or (3.253). We therefore conclude that the only possible families of F-space kernels that can lead to no absorption are those with $u(x, x_0; t)$ given by equation (3.215), i.e., the barrier-free solutions on $x \in (0, \infty)$ with $x_L = 0$, $x_H = \infty$. We therefore further specialize our analysis exclusively to families of barrier-free solutions with underlying barrier-free x-space kernel chosen for $u$. Upon substituting equation (3.215) into equation (3.319), it readily follows that the first term in equation (3.319) is zero in the limits $x \to 0, \infty$. Indeed, for the lower limit $x \to 0$ this is a consequence of the asymptotic identities $I_\mu(z) \to c_1 z^{\mu}$, $K_\mu(z) \to c_2 z^{-\mu}$, as $z \to 0$, where $c_1$, $c_2$ are positive constants dependent on the order $\mu > 0$. For the upper limit the asymptotic properties $I_\mu(z) \to e^{z}/\sqrt{2\pi z}$, $K_\mu(z) \to \sqrt{\pi/2z}\, e^{-z}$, as $z \to \infty$, are used. The exponential factor $e^{-x/2t}$ in $u$ is hence more rapidly decreasing, and the term in $u$ vanishes in the limit $x \to \infty$. Using equation (3.277), the no-absorption condition is then reduced to


x=00 


Ou(x, Xo; t) 
Ox 
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—X9/2t 
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0 
⁹Note: The arguments follow in exactly the same way whether a monotonically increasing or decreasing map $F = F(x)$ is assumed.


The foregoing asymptotic properties for the $I_\mu$, $K_\mu$ functions give zero for the upper limit $x \to \infty$, for all choices of parameters $q_1, q_2$. On the other hand, evaluating the lower limit while using the small-argument expressions for $I_\mu$ and $K_\mu$ gives, to leading order:


e7%0/2t 


ð 

perl a (Cix + Cq x' a (#27) > Aq, (3.321) 
where $C_1$, $C_2$, $A$ are positive constants and $A$ depends on $t$, $\rho$, $x_0$, and $\mu$. Hence, we conclude that if $q_2 \neq 0$, then there is a nonzero finite rate of absorption (i.e., paths are absorbed outside of the solution region) at the lower boundary; otherwise, for $q_2 = 0$, there is no absorption, and probability is conserved for all time. The latter is the case of the barrier-free four-parameter subfamily kernel as given by equation (3.294). Notice that this conclusion is indeed consistent with equation (3.296).


3.8.5 Barrier Pricing Formulas for Multiparameter Volatility Models


In concluding this chapter we give a brief discussion of how pricing kernels and European 
option formulas can be obtained in analytically closed form for multiparameter state-dependent 
models and in particular for the Bessel family of models. 

Let us assume we have solved for an underlying x-space barrier kernel in the form of an exact eigenfunction expansion given by equation (3.202) for a domain $x, x_0 \in (x_L, x_H)$ with zero-boundary conditions at the endpoints of the domain. Consider any $F_t$ process that is mapped onto an underlying $x_t$-process, thereby satisfying the general assumptions of Theorem 3.1. From the discussion in Section 3.8.1 it follows that the reduction transformation formula (3.274) can be used together with equation (3.202) to obtain a general family of exact eigenfunction expansions for an F-space pricing kernel that takes the generic form








U(F, Fy; 1) = a “oD Lig we EDADA): (3.322) 


The generating function $\hat u$ solves equation (3.272) and is used to obtain $x_0 = X(F_0)$, $x = X(F)$ by inverting equation (3.273), where one uses either appropriate branch of the map $F = F(x)$ (e.g., monotonically increasing or decreasing) and the volatility function $\sigma(F)$ for the $F_t$ process is given by equation (3.271). The eigenfunctions $\phi_n$ solve equation (3.198). The x-space endpoints are mapped onto the corresponding barrier levels in F-space: $H = F(x_H)$ and $L = F(x_L)$.

By specializing to the four-parameter subfamily of Bessel models of Section 3.8.3, the 
pricing kernels are given by relation (3.293), where u is taken to be the Bessel kernel as in 
either equation (3.232), (3.245), or (3.253), depending on whether we are seeking an F-space 
pricing kernel U for a double barrier or a single barrier, respectively. For instance, in the case 
of a double barrier with absorption of paths $F_t$ at levels $L$ and $H$, we insert equation (3.232)
into equation (3.293) to obtain the pricing kernel as a closed-form eigenfunction series 
solution: 


2 


UP? (F, Fy, L, H; t) = G) la (v 2px) = “ENE etal (x), (x0), (3.323) 


I L, (/2Px0) 2PXo) n=l 


with $x = X(F)$, $x_0 = X(F_0)$ given by inverting equation (3.290). Note that this barrier kernel is a special case of equation (3.322). The mapping in equation (3.290) provides us with the unique condition used to fix the two barrier levels $L$, $H$ by appropriate choice of x-space endpoints:


= Pak KV 2px) 2px,) = s K,(/2pxu) 2pxy) 


T Ag 2px) E Ag 2pxg) 


Or alternatively, given $L$ and $H$ values, these equations are uniquely inverted to give the endpoint values $x_L = X(L)$ and $x_H = X(H)$. The eigenfunctions in equation (3.230) [and the eigenvalues satisfying equation (3.221)] are then given uniquely for all $n \ge 1$. Note that since the mapping $F(x)$ is decreasing, $F, F_0 \in [H, L]$, so the lower barrier is at $H$ and the upper barrier is at $L$ in our present notation. That is, the lower (upper) x-space endpoints are mapped to the upper (lower) barrier levels in F-space.

Similar formulas for the pricing kernel also follow for the single-barrier cases. Cumulative probability densities can also be computed in analytically closed form. These are in turn used to provide closed-form pricing formulas for European barrier calls and puts. For the case of the double barrier with $L, H > \bar F$, we define the cumulative density for $H < F < L$ as

$$\Phi(F, F_0, t) = \int_H^F U^{DB}(f, F_0, L, H; t)\, df. \qquad (3.325)$$


Using equation (3.323) and changing integration variables from f to x= X(f) via the 
mapping (3.290) we obtain 


L orei Pn(X(Fo)) (7 
©(F, Fn = De (oenl) LAS L i. 1, (/2px)b,(x)dx. (3.326) 


Using equation (3.230) for $\phi_n(x)$ and making a simple change of variables, one can then use integral identities (3.362) and (3.363) to evaluate the resulting integrals. After collecting terms and simplifying with the use of the Wronskian identity (3.384) we arrive at the closed-form series


®,(F, Fy, t) = Sena e PHD (X(Fy))W, (X(F)). (3.327) 


LAV AT )n= 


where we have defined the functions 
1) = E (8) — V2 lV 2H) A) 
- A) (3.328) 
and 
b) = M r Fe Bi VTD ~ Juels) e3 


The normalization factor $N_n$ is given by equation (3.231). In a similar manner, the related cumulative density given by

$$\bar\Phi(F, F_0, t) = \int_H^F U^{DB}(f, F_0, L, H; t)\, f\, df, \qquad (3.330)$$

for $H < F < L$, can also be evaluated analytically. Again using equation (3.323) and changing integration variables from $f$ to $x = X(f)$ via the mapping (3.290),


= = i X(F, 
®,(F, Fo, t) z F®,(F, Ey; tht+a>- e7 P+lenl)t Gn ( ( 0) 


n=1 L(y 2pX(Fo)) 
X(H) 
x J ? K,,(/2px),(x)dx. (3.331) 


This last integral is evaluated using equation (3.230) for $\phi_n(x)$; and, after changing variables, we use the integral identities (3.364) and (3.365). Collecting terms, using (3.384), and simplifying we obtain the closed-form series


ee ` eT PHE) : X F, , X(F)), (3.332 
LOKT) Z ,(X(Fo))Pn,p(X(F)), (8-332) 


È, (F, Fy, t) = F®,(F, Fy, t) + 


where 


Oe Al DPRK EETA EE E AEI A 


> ZNK, (2PX | (3.333) 


Under the four-parameter Bessel family of volatility models, a European double-knockout call maturing in time $t$ with payoff $(F - K)_+$ therefore has value given by (excluding discounting)

$$C^{DB}(F_0, K, t) = \bar\Phi(L, F_0, t) - \bar\Phi(K, F_0, t) - K\,[\Phi(L, F_0, t) - \Phi(K, F_0, t)] \qquad (3.334)$$

for $H < K < L$ and

$$C^{DB}(F_0, K, t) = \bar\Phi(L, F_0, t) - K\,\Phi(L, F_0, t) \qquad (3.335)$$


for strike values below the barriers, K < H < L. An analogous formula for the put option is 
also readily obtained. Analogous formulas for the option values for single barriers can also 
be derived in similar fashion. By applying a similar limiting procedure to the one discussed 
in Section 3.8.3.1, the foregoing families of formulas can also be used to recover closed-form 
formulas for barrier pricing kernels (as well as barrier call and put option values) for the 
CEV model with zero drift function. In particular, one can recover the double-barrier kernel 
for the CEV model (see Problem 1). Moreover, as a special case of the CEV solutions, even 
simpler closed-form expressions for the barrier kernels and barrier option values arise for the 
quadratic model of Section 3.8.3.2. As already discussed, in this case $\mu = 1/2$ and the modified Bessel functions are just the elementary hyperbolic sine and exponential functions, while the ordinary Bessel functions are just the sine and cosine functions (see Problem 2).
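The reduction of half-integer-order Bessel functions to elementary functions can be checked numerically. The following is a minimal sketch, not part of the original derivation: it sums the standard power series for $J_\nu$ and $I_\nu$, builds $Y_{1/2}$ and $K_{1/2}$ from the usual connection formulas, and compares the results with the elementary closed forms. The function names and the number of series terms are illustrative assumptions.

```python
import math


def bessel_series(nu, z, modified, terms=40):
    """Power series for I_nu(z) (modified=True) or J_nu(z) (modified=False), z > 0.

    Assumes nu is not a negative integer, so that gamma(k + nu + 1) is finite.
    """
    total = 0.0
    for k in range(terms):
        sign = 1.0 if modified else (-1.0) ** k
        total += sign * (z / 2.0) ** (2 * k + nu) / (
            math.factorial(k) * math.gamma(k + nu + 1.0)
        )
    return total


def half_integer_from_series(z):
    """J, Y, I, K of order 1/2 built from the series and the connection formulas."""
    j_plus = bessel_series(0.5, z, modified=False)
    j_minus = bessel_series(-0.5, z, modified=False)
    i_plus = bessel_series(0.5, z, modified=True)
    i_minus = bessel_series(-0.5, z, modified=True)
    return {
        "J": j_plus,
        "Y": -j_minus,  # Y_nu = (J_nu cos(nu pi) - J_{-nu}) / sin(nu pi) at nu = 1/2
        "I": i_plus,
        "K": 0.5 * math.pi * (i_minus - i_plus),  # K_nu = (pi/2)(I_{-nu} - I_nu)/sin(nu pi)
    }


def half_integer_closed_forms(z):
    """Elementary expressions for order 1/2."""
    pref = math.sqrt(2.0 / (math.pi * z))
    return {
        "J": pref * math.sin(z),
        "Y": -pref * math.cos(z),
        "I": pref * math.sinh(z),
        "K": math.sqrt(math.pi / (2.0 * z)) * math.exp(-z),
    }


if __name__ == "__main__":
    z = 1.3
    series = half_integer_from_series(z)
    closed = half_integer_closed_forms(z)
    for name in ("J", "Y", "I", "K"):
        print(name, series[name], closed[name])
```

The series is adequate for moderate arguments only; for large $z$ the alternating series for $J_\nu$ loses accuracy, and a library implementation would be the safer choice.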


Problems 


Problem 1. By using a limiting procedure similar to the one in Section 3.8.3.1 for the barrier-free case, with $\mu = 1/(2\theta)$, obtain an exact eigenfunction series expansion for the double-barrier pricing kernel for the CEV model with volatility given by equation (3.304) and zero drift function in the $F_t$ process. Express your answer explicitly in terms of $F$, $F_0$, $L$, $H$, $\sigma_0$, $\theta$, and time $t$. Also, provide the equation for the eigenvalues.

Problem 2. Obtain an exact eigenfunction series expansion for the double-barrier pricing kernel for the quadratic model of Section 3.8.3.2. This can be achieved by specializing the CEV formula from Problem 1 using $\theta = 1$. Another (simpler) way (which makes no use of the CEV result of Problem 1) is to set $\mu = 1/2$ in the four-parameter Bessel family map (3.290) and in equation (3.323). Then by letting $a = C/\rho$ (for an appropriate choice of constant $C$), take the limit $\rho \to 0$ of equation (3.323). The series should in fact reduce to elementary functions, with a simple exact expression for the eigenvalues $e_n$. Express your answer explicitly in terms of $F$, $F_0$, $L$, $H$, $\sigma_0$, and $t$. Hint: For half-integer order the Bessel functions are

$$J_{1/2}(z) = \sqrt{2/(\pi z)}\, \sin z, \quad Y_{1/2}(z) = -\sqrt{2/(\pi z)}\, \cos z, \quad I_{1/2}(z) = \sqrt{2/(\pi z)}\, \sinh z, \quad K_{1/2}(z) = \sqrt{\pi/(2z)}\, e^{-z}.$$

Problem 3. Derive a closed-form series expression for a double-barrier call option price for the quadratic model of Section 3.8.3.2. You may use the result in Problem 2.


Problem 4. Following the same limiting procedure as in Problem 1, obtain an exact eigen- 
function series expansion for the price of a double-barrier call option for the CEV model; 
i.e., obtain the analogues of equations (3.334) and (3.335) for the CEV model, with zero drift 
function in the $F_t$ process.


3.9 Appendix A: Proof of Lemma 3.1 
Assume a relationship among the fundamental solutions in the form 


vO) a (3) 
AF) PC) 


with a, (x) to be determined. By direct substitution of this Ansatz into equation (3.258), 
applying the chain rule of differentiation and collecting terms gives 


U(F, Fo; t) = 








u(x, xo; t), (3.336) 








ðu 1 u 1[(”’¢),+v(r¢), , ðu 
g TOUS <] $ +v0'(F) | 
t j es Au B33 


where primes and subscript variables denote derivatives with respect to the appropriate variable and function arguments and $u = u(x, x_0; t)$. Rewriting equation (3.256) by explicitly carrying out the derivatives gives

ðu l, u 


ô 
= -p — 
ot 2 ax? 0 


+ (2vv, — à) a + (2 +0v,,—A,)u. (3.338) 
x 


Combining the last two equations gives a linear equation in $u$ and $u_x$. Since this equation must be valid for arbitrary solution $u$, the coefficients in $u$ and $u_x$ must be zero identically. Setting the coefficient in $u_x$ to zero gives a first-order equation that can be cast in the form

$$\frac{d}{dx}\log\phi(x) = \frac{1}{2}\,\frac{v_x(x) - \sigma'(F)}{v(x)} - \frac{\lambda(x)}{v(x)^2} = \frac{1}{2}\frac{d}{dx}\log\frac{v(x)}{\sigma(F(x))} - \frac{\lambda(x)}{v(x)^2}. \qquad (3.339)$$

Here we used $F'(x) = \sigma/v$.




Integrating from an arbitrary point $x_0$ to $x$, we find


(x) = 1 v(x) /o(F(x)) * A(s) 
o G ae. as). (3.340) 


Setting the coefficient in $u$ to zero and using equation (3.339) gives a second-order equation in $\phi$:








yes + o —o'(F)’) +0(F)o"(F)—vv,,— (3v, + (F) +2A,-—2a=0. (3.341) 


For a solution to exist, this equation must be consistent with equation (3.339). Hence, by 
differentiating equation (3.339) once with respect to x while using equation (3.339) in the 
resulting expression, we obtain 





PE rg. oP E) O E) 
F mtot -a (3.342) 


Inserting the value of $v^2\phi_{xx}/\phi$ in this equation into the previous one and simplifying finally leads to an expression for a:





1 A? Vy 1 n 1 2 / 2 
a=z>|à + 2A— AOE oF) — va) +70 = o (F)) 
2 y? v 2 4 


This is equation (3.257) and must be a constant, as assumed throughout the derivations. 
Hence, combining equations (3.336), (3.340), and (3.257), we conclude that U given by 
equation (3.259) indeed solves equation (3.258). Moreover, the Dirac delta function initial 
condition in F-space is also satisfied, since 


$$\lim_{t \to 0+} U(F, F_0; t) = \frac{v(X(F))}{\sigma(F)}\, \delta(X(F) - X(F_0)) = \delta(F - F_0), \qquad (3.343)$$

where $X'(F) = v(X(F))/\sigma(F)$.


3.10 Appendix B: Alternative “Proof” of Theorem 3.1


Here we show how Theorem 3.1 arises as an application of the (continuous-time) fundamental theorem of asset pricing presented in Chapter 1. The argument can be formulated by making reference to a financial model. Consider a multicurrency financial model where domestic interest rates are zero, the process $x_t$ is interpreted as a price process for an asset denominated in a foreign currency, and $F_t = F(x_t)$ is the price process for a contingent claim (a quanto option) in the domestic currency. Assume that under the pricing measure where $F_t$ has zero drift function, the underlying foreign price process $x_t$ obeys the equation

$$dx_t = \mu(x_t)\, dt + v(x_t)\, dW_t \qquad (3.344)$$

for some drift function $\mu(x)$. Assume also that the volatility $v(x)$ is such that, for some choice of the drift function $\lambda(x)$, one can solve stochastic differential equation (3.254). By solving, we mean that it is possible to find the pricing kernel $u(x, t, x_0)$, which can be interpreted as the current time-zero price of an infinitesimally narrow butterfly spread option of maturity time $t$, i.e., with delta function payoff $\delta(x - x_0)$, where $x_0$ is the spot price of the underlying foreign asset.

Our objective is to show that if the volatility function for the quanto option $F_t$ is defined as [i.e., equation (3.271)]

$$\sigma(F) = \frac{\sigma_0\, v(X(F))\, \exp\left[-2\int^{X(F)} \frac{\lambda(s)}{v(s)^2}\, ds\right]}{\hat u(X(F), \rho)^2}, \qquad (3.345)$$

then it is possible to find the pricing kernel for the quanto option $F_t$ (which will be in analytically closed form assuming the kernel for the $x_t$-process is given analytically). Here, $\rho$ is a real-valued parameter and the function $\hat u = \hat u(x, \rho)$ is defined as the solution of equation (3.272), i.e.,


$$\frac{1}{2} v^2 \hat u_{xx} = \rho\, \hat u - \lambda\, \hat u_x. \qquad (3.346)$$

Finally, the function $X(F)$ in equation (3.345) and its inverse, $F(x)$, are defined as the solutions of the equation

$$\frac{dX(F)}{dF} = \frac{v(x)}{\sigma(F)}. \qquad (3.347)$$

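The key in this derivation involves a change of numeraire asset given by a process $g_t$, defined as

$$g_t = \frac{e^{\rho t}}{\hat u(x_t, \rho)}, \qquad (3.348)$$

and by applying Itô's lemma to this function of $x_t$ and $t$ we have the SDE

$$\frac{dg_t}{g_t} = \left(\rho - \mu\frac{\hat u_x}{\hat u} + v^2\left(\frac{\hat u_x}{\hat u}\right)^2 - \frac{1}{2} v^2 \frac{\hat u_{xx}}{\hat u}\right) dt + \sigma^g\, dW_t, \qquad (3.349)$$

where the lognormal volatility of $g_t$ (denoted by $\sigma^g$) is given by

$$\sigma^g = -v\, \frac{\hat u_x}{\hat u}. \qquad (3.350)$$

Substituting equation (3.346), we find that

$$\frac{dg_t}{g_t} = \left(\frac{\mu - \lambda}{v}\, \sigma^g + (\sigma^g)^2\right) dt + \sigma^g\, dW_t. \qquad (3.351)$$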


To demonstrate that $g_t$ defines a domestic asset price process, consider this equation in the original pricing measure, where the domestic quanto option price process $F_t$ has zero drift function. In this case, using Itô's lemma on the inverse mapping $x_t = X(F_t)$, we arrive at an SDE of the form of equation (3.344), with drift given by

$$\mu(x) = \frac{\sigma(F)^2}{2}\frac{d}{dF}\frac{dX(F)}{dF} = \frac{\sigma(F)^2}{2}\frac{d}{dF}\frac{v(x)}{\sigma(F)}, \qquad (3.352)$$

where $F = F(x)$. Using the chain rule for differentiation and expressing all functions in terms of $x$, we then have

$$\mu = \frac{\sigma v}{2}\frac{d}{dx}\left(\frac{v}{\sigma}\right) = \frac{v}{2}\left[v_x - v\frac{\sigma_x}{\sigma}\right], \qquad (3.353)$$

where $\sigma = \sigma(F(x))$ is the volatility function for the quanto option of price $F_t$. Hence, by substituting into the expression for the risk-neutral drift of $g_t$ in equation (3.351) we find

$$\frac{\mu - \lambda}{v}\, \sigma^g + (\sigma^g)^2 = \left[\lambda + \frac{v^2 \sigma_x}{2\sigma} - \frac{1}{2} v v_x\right]\frac{\hat u_x}{\hat u} + v^2\left[\frac{\hat u_x}{\hat u}\right]^2. \qquad (3.354)$$

Using expression (3.345) for the volatility of the quanto option $F_t$, we find that

$$\frac{\sigma_x}{\sigma} = \frac{v_x}{v} - \frac{2\lambda}{v^2} - \frac{2\hat u_x}{\hat u}. \qquad (3.355)$$

Substituting into equation (3.354), we find that the drift of $g_t$ under the pricing measure vanishes, as it ought to for a domestic asset. Hence, $g_t$ can be interpreted as the process for a numeraire asset.


Next, consider equation (3.351) again, but now under the measure having $g_t$ as numeraire. Under this pricing measure the price of risk is $\sigma^g$, hence the lognormal drift is just $(\sigma^g)^2$, and

$$dg_t = (\sigma^g)^2 g_t\, dt + \sigma^g g_t\, dW_t. \qquad (3.356)$$

Comparison with equation (3.351) shows that under this measure the drift $\mu$ of the underlying process $x_t$ is $\lambda$, as stated. This implies that the pricing kernel for the quanto option, of volatility given by equation (3.345), is given by equation (3.274) with equation (3.276), as required.
Integral relations:¹⁰


eB 





L x’ eT (2B./x)dx = B” 


qt? 


o0 eP +r )/a 
[T AL OBSDL OY dx= (e), 


Q 





œ e P+) /a 
JE IBSIN dx= E (22) 


a 
In these integrals, the order of the Bessel functions is such that Re v > —1. 


X 
a — b2 





f xJ,(ax)J,(bx)dx = [osn (ax) J, (bx) — bJ, (a ®) Ing, (6) 


X 


/ xJ,(ax)Y, (bx)dx = Poe [osav Y,,,, (bx) — aJ,, (ax) ro) À 





¹⁰Indefinite integrals are given to within an arbitrary constant.


(3.357) 


(3.358) 


(3.359) 


(3.360) 


(3.361) 




$b \neq a$ in equations (3.360) and (3.361).





f xJ,(ax)I,(bx)dx = J, (ax), (bx) + bS,(ax)1,4, (on), (3.362) 





f xl,(ax)Y,(bx)dx = 





f xY, (ax)K,(bx)dx = 


a (ax) Y,,,(bx) + al,,,(ax)Y, œ|, (3.363) 
az + b2 sak 


Y,,,, (ax) K, (bx) — bY, Ka, (3.364) 


/ xJ,(ax)K,(bx)dx = are [astako = bi,(ax)K,.(b9)), (3.365) 
f xJ?(ax)dx = +] 2x) - J, (ax)J „la|, (3.366) 
/ x¥?(ax)dx= S| a—7 i (ax)Y, pla], (3.367) 
/ xJ,(ax)Y,(ax)dx = - [2s ara) -— J, (ax)Y, (ax) 
= Jaadla; (3.368) 
f xt Y?(x)dx = Kadu [ror al; (3.369) 
/ 12 (x)dx = — Ee +J nil; (3.370) 
/ PHT (XY, (x)dx = FA OYI a OY, 11100) (3.371) 


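The Wronskian $W[I_\nu(x), K_\nu(x)] = -1/x$ leads to other useful indefinite integrals:

$$\int \frac{dx}{x\,[aI_\nu(x) + bK_\nu(x)]^2} = \frac{(1/b)\, I_\nu(x)}{aI_\nu(x) + bK_\nu(x)}, \quad b \neq 0, \qquad (3.372)$$

or equivalently:

$$\int \frac{dx}{x\,[aI_\nu(x) + bK_\nu(x)]^2} = \frac{-(1/a)\, K_\nu(x)}{aI_\nu(x) + bK_\nu(x)}, \quad a \neq 0. \qquad (3.373)$$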


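Analogous integral identities involving the ordinary Bessel $\{J_\nu, Y_\nu\}$ pair also follow from the Wronskian $W[J_\nu(x), Y_\nu(x)] = 2/(\pi x)$.

Differential equations:

$$Z_\nu''(x) + \frac{1}{x} Z_\nu'(x) - \left(1 + \frac{\nu^2}{x^2}\right) Z_\nu(x) = 0; \quad Z_\nu = I_\nu, K_\nu. \qquad (3.374)$$

$$Z_\nu''(x) + \frac{1}{x} Z_\nu'(x) + \left(1 - \frac{\nu^2}{x^2}\right) Z_\nu(x) = 0; \quad Z_\nu = J_\nu, Y_\nu. \qquad (3.375)$$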
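Recurrence relations:

$$2I_\nu'(x) = I_{\nu-1}(x) + I_{\nu+1}(x), \qquad (3.376)$$
$$-2K_\nu'(x) = K_{\nu-1}(x) + K_{\nu+1}(x), \qquad (3.377)$$
$$(2\nu/x)\, I_\nu(x) = I_{\nu-1}(x) - I_{\nu+1}(x), \qquad (3.378)$$
$$-(2\nu/x)\, K_\nu(x) = K_{\nu-1}(x) - K_{\nu+1}(x), \qquad (3.379)$$
$$xI_\nu'(x) = \pm\nu I_\nu(x) + xI_{\nu\pm1}(x), \qquad (3.380)$$
$$xK_\nu'(x) = \pm\nu K_\nu(x) - xK_{\nu\pm1}(x), \qquad (3.381)$$
$$xZ_\nu'(x) = \pm\nu Z_\nu(x) \mp xZ_{\nu\pm1}(x), \qquad (3.382)$$
$$(2\nu/x)\, Z_\nu(x) = Z_{\nu+1}(x) + Z_{\nu-1}(x), \qquad (3.383)$$

where $Z_\nu = J_\nu, Y_\nu$. Combining the Wronskian with recurrence relations gives

$$J_{\nu+1}(x) Y_\nu(x) - J_\nu(x) Y_{\nu+1}(x) = \frac{2}{\pi x}. \qquad (3.384)$$

Leading-order asymptotic expansions for $|z| \to \infty$:

$$I_\nu(z) \sim \frac{e^{z}}{\sqrt{2\pi z}}, \qquad (3.385)$$
$$K_\nu(z) \sim \sqrt{\frac{\pi}{2z}}\, e^{-z}. \qquad (3.386)$$

Jump discontinuities across the complex branch cut $z = e^{i\pi}x \to e^{-i\pi}x$, $x > 0$:

$$I_\nu(e^{i\pi}x) - I_\nu(e^{-i\pi}x) = 2i \sin(\pi\nu)\, I_\nu(x), \qquad (3.387)$$
$$K_\nu(e^{i\pi}x) - K_\nu(e^{-i\pi}x) = -i\pi\,[I_{-\nu}(x) + I_\nu(x)], \qquad (3.388)$$
$$K_\nu(e^{i\pi}x) + K_\nu(e^{-i\pi}x) = 2\cos(\pi\nu)\, K_\nu(x). \qquad (3.389)$$

Leading-order expansions for small argument $z \to 0$:

$$I_\nu(z) \sim \frac{1}{\Gamma(\nu+1)}\left(\frac{z}{2}\right)^{\nu} + O(z^{\nu+2}), \quad \text{for complex } \nu \neq -1, -2, -3, \ldots \qquad (3.390)$$
$$I_\nu(z) \sim \frac{1}{\Gamma(1-\nu)}\left(\frac{z}{2}\right)^{-\nu} + O(z^{2-\nu}), \quad \text{for } \nu = -1, -2, -3, \ldots \qquad (3.391)$$
$$K_\nu(z) \sim \frac{\Gamma(|\nu|)}{2}\left(\frac{z}{2}\right)^{-|\nu|} + O(z^{2-|\nu|}), \quad \text{for real } \nu \neq 0. \qquad (3.392)$$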
CHAPTER 4


Numerical Methods for Value-at-Risk 


Portfolios of financial assets are exposed to many types of risks, future events that if they 
occurred would result in financial losses. The purpose of risk management is to quantify 
and control these dangers. Value-at-risk (VaR) is a measure of the market risk, the chance 
of a loss in a company’s portfolio caused by unfavorable changes in prices and rates. 
Minimum risk management standards for financial institutions are set and enforced by national 
regulators. The Basel Accord [Bas88], the market risk amendment [Bas96a, Bas96b], and 
the recent update [Bas88] contain the international guidelines implemented by the national 
agencies.¹ Value-at-risk has become the industry standard for quantifying market risk, partly
because of its intuitive appeal and, more importantly, because it is endorsed in the Basel 
Accord. 

For a given portfolio, value-at-risk is defined as the maximum loss forecast over a specified holding period and within a given confidence level (see Figure 4.1). In other words, it is a percentile of the distribution for changes in portfolio value. If $\Delta\Pi$ is the change in portfolio value during the holding period, then value-at-risk is the solution to a nonlinear equation:

$$P[\Delta\Pi < -\mathrm{VaR}] = 1 - \alpha, \qquad (4.1)$$

where $\alpha$ is the confidence level. Another interpretation is that in the long term we expect losses exceeding value-at-risk with frequency $1 - \alpha$. For $\alpha = 99\%$, we expect losses exceeding value-at-risk 1 out of every 100 days. Regulators require value-at-risk to be computed daily with a confidence level of 99% and for a holding period of 10 days. However, since the rules allow for value-at-risk for 1 day to be scaled to approximate the risk for 10 days, we choose to consider daily holding periods in our examples. There are many review papers about value-at-risk simulation: Stambaugh [Sta96] gives a high-level introduction; for more in-depth, general, algorithmic, and mathematical discussions, we have a personal preference for [Mor96a, Hul00, DP97].²

¹The Basel Accord and related documents are available from the Bank for International Settlements (www.bis.org).

FIGURE 4.1 The probability that a loss is greater than value-at-risk, the area of the shaded region, is equal to $1 - \alpha$.
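The percentile interpretation of equation (4.1) can be illustrated with a short sketch that reads value-at-risk off a sample of portfolio value changes. This is only one of several possible empirical quantile conventions (here the order statistic of rank ceil((1 - alpha) n)), and the function name and the sample data are hypothetical.

```python
import math


def empirical_var(value_changes, alpha):
    """Value-at-risk at confidence level alpha from a sample of portfolio value changes.

    Uses the order statistic of rank ceil((1 - alpha) * n) among the sorted changes;
    other quantile conventions (e.g., interpolation) are equally legitimate.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if not value_changes:
        raise ValueError("need at least one observation")
    ordered = sorted(value_changes)  # worst losses first
    n = len(ordered)
    # Small tolerance guards against floating-point noise in (1 - alpha) * n.
    rank = max(1, math.ceil((1.0 - alpha) * n - 1e-9))
    return -ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical sample: 100 daily changes equal to -50, -49, ..., 49.
    changes = list(range(-50, 50))
    print(empirical_var(changes, 0.99))
    print(empirical_var(changes, 0.95))
```

With only 100 observations, the 99% level rests on a single data point, which is one reason tail quantiles are hard to estimate from short histories.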

In financial markets, risk is caused by uncertainty about the value of an investment in 
the future. The value of a portfolio is a function of a set of risk factors. Risk factor is the 
generic term for a financial variable related to market prices of selected reference securities, 
for example, equity indices, interest rates, foreign exchange rates, and commodity futures 
prices. Market risk is the risk that the value of a portfolio declines as a consequence of 
changes in the risk-factor values. Therefore, to model market risk we need to understand how 
risk factors evolve over time. 

Consistently with the hypothesis of absence of arbitrage, we will assume that the changes 
in risk factors are random. Although historical data is of limited use to predict changes in risk 
factors, it can be used to estimate statistical models to model risk factors and their correlations. 
In our examples, we use stocks as elementary risk factors, although the methodology applies 
to a wide range of financial instruments. 

A simple formula for value-at-risk can be obtained in the case where an $n \times 1$ vector of relative changes $R$ in the market risk factors is a multivariate normal random variable with mean vector $\mu$ and covariance matrix $C$, and if one assumes that the change in portfolio value can be approximated by an affine function of the relative changes in the risk factors:

$$\Delta\Pi \approx \Theta + \Delta^T R. \qquad (4.2)$$

Throughout this chapter we shall use superscript $T$ to denote the transpose. Note: We are using $\Delta\Pi$ to denote the change in portfolio value, i.e., $\Delta\Pi = \Pi_t - \Pi_0$ for a time lapse $t$, whereas $\Delta$ in the dot product is the vector of sensitivities w.r.t. the returns (i.e., the delta Greeks of the portfolio), as defined later.

²The Web site www.gloriamundi.org is an excellent source for information and links to papers on value-at-risk.

Since $\Delta\Pi - \Theta = \Delta^T R$ is a linear combination of jointly normal random variables, it is itself normal. The distribution is determined by its mean and variance,

$$\mu_{(\Delta\Pi - \Theta)} = E[\Delta\Pi - \Theta] = E[\Delta^T R] = \Delta^T E[R] = \Delta^T \mu, \qquad (4.3)$$
$$\sigma^2_{(\Delta\Pi - \Theta)} = \sigma^2_{\Delta\Pi} = E[(\Delta^T (R - \mu))^2] = \Delta^T E[(R - \mu)(R - \mu)^T]\, \Delta = \Delta^T C \Delta. \qquad (4.4)$$

So $\Delta\Pi$ is the normal random variable

$$\Delta\Pi = z\sqrt{\Delta^T C \Delta} + \Theta + \Delta^T \mu, \qquad (4.5)$$

where $z \sim N(0, 1)$. Hence, inverting equation (4.1) while using $N^{-1}(1 - \alpha) = -N^{-1}(\alpha)$ gives the value-at-risk

$$\mathrm{VaR} = N^{-1}(\alpha)\sqrt{\Delta^T C \Delta} - \Theta - \Delta^T \mu, \qquad (4.6)$$

where $N^{-1}(\cdot)$ is the inverse of the standard normal cdf.
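As a minimal sketch of equation (4.6), the closed-form value-at-risk of the linear normal model can be evaluated with the Python standard library alone. The function name, argument names, and the two-asset numbers below are illustrative assumptions; `constant` stands for the constant term of the affine approximation (4.2).

```python
import math
from statistics import NormalDist


def delta_normal_var(delta, mean, cov, constant=0.0, alpha=0.99):
    """Closed-form VaR for the linear model with normal risk-factor returns.

    delta    : list of sensitivities to the returns
    mean     : list of expected returns
    cov      : covariance matrix as a list of rows
    constant : constant term of the affine approximation (4.2)
    """
    n = len(delta)
    mean_term = sum(d * m for d, m in zip(delta, mean))
    variance = sum(delta[i] * cov[i][j] * delta[j] for i in range(n) for j in range(n))
    z_alpha = NormalDist().inv_cdf(alpha)  # inverse standard normal cdf
    return z_alpha * math.sqrt(variance) - constant - mean_term


if __name__ == "__main__":
    # Hypothetical two-asset portfolio with daily return statistics.
    delta = [1000.0, 500.0]
    mean = [0.0005, 0.0002]
    cov = [[0.0004, 0.0001],
           [0.0001, 0.0009]]
    var_99 = delta_normal_var(delta, mean, cov, alpha=0.99)
    print(round(var_99, 2))
```

By construction, the normal distribution with mean `constant + mean_term` and variance `variance` assigns probability $1 - \alpha$ to changes below minus the returned value, which is the defining property (4.1).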

The linear model with normal relative changes has a closed-form solution, but it suffers 
from two serious problems. First, real-world returns have fatter tails than normal distri- 
butions. The model will therefore underestimate the likelihood of extreme returns, which 
as a consequence may lead to inaccurate estimates of value-at-risk. Second, for portfolios 
with derivatives, the change in value is a nonlinear function. The local error in the linear 
approximation will therefore often be unacceptable, a property that is exacerbated by dynamic 
hedging strategies that use the linearization to eliminate risk locally. To compute value-at-risk 
for models that take these difficulties into account is a substantially harder task. 

Let $S_t$ be the process for a risk factor. Returns on $S_t$ over the time horizon $[0, t]$ can be defined either as arithmetic returns

$$R_t = \frac{S_t - S_0}{S_0} = \frac{\Delta S}{S_0}$$

or as the log-return,

$$R_t = \log S_t - \log S_0. \qquad (4.7)$$

Log-returns have the advantage that one can aggregate returns over time by addition. In the multivariate case, $S_t$ is a vector of prices and returns are taken componentwise. Of course the two are closely related. The difference,

$$\frac{1}{2}\left(\frac{\Delta S}{S_0}\right)^2 + O\left(\left(\frac{\Delta S}{S_0}\right)^3\right),$$

is typically negligibly small for estimation purposes, and either type of return can safely be approximated by the other. In the examples that follow, we choose log-returns.
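The relation between the two return definitions is easy to verify numerically. The sketch below, with hypothetical prices, compares the arithmetic return and the log-return, checks that their difference is close to half the squared arithmetic return, and illustrates that log-returns aggregate over time by addition.

```python
import math


def arithmetic_return(s_start, s_end):
    """Arithmetic return (S_t - S_0) / S_0."""
    return (s_end - s_start) / s_start


def log_return(s_start, s_end):
    """Log-return log S_t - log S_0, as in equation (4.7)."""
    return math.log(s_end) - math.log(s_start)


if __name__ == "__main__":
    s0, s1, s2 = 100.0, 101.0, 103.0  # hypothetical prices
    a = arithmetic_return(s0, s1)
    r = log_return(s0, s1)
    # The difference is approximately half the squared arithmetic return.
    print(a - r, 0.5 * a * a)
    # Log-returns over consecutive periods add up to the total log-return.
    print(log_return(s0, s1) + log_return(s1, s2), log_return(s0, s2))
```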

Because the return is dimensionless, i.e., the quantity does not have a unit, return models are preferred over models for prices. We consider a model in which the returns, sampled at equally spaced points in time, form a sequence $\{R_i\}_{i=1}^{\infty}$ of independent and identically distributed random variables. This means that stock prices are discrete-time Markov chains with an infinite state space [Ros00]. Choosing different distributions gives different models in this family.
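A minimal simulation sketch of this random-walk family follows. It assumes, purely for illustration, that the independent and identically distributed increments are normally distributed log-returns; any other increment distribution could be substituted. The function name and parameter values are hypothetical.

```python
import math
import random


def simulate_prices(s0, mu, sigma, n_steps, seed=0):
    """Simulate a price path whose log-returns are i.i.d. N(mu, sigma^2).

    Each new price depends only on the current price and a fresh increment,
    which is the Markov property of the model.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    prices = [s0]
    for _ in range(n_steps):
        log_ret = rng.gauss(mu, sigma)
        prices.append(prices[-1] * math.exp(log_ret))
    return prices


if __name__ == "__main__":
    path = simulate_prices(s0=100.0, mu=0.0003, sigma=0.02, n_steps=250, seed=42)
    print(len(path), min(path) > 0.0)
```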
FIGURE 4.2 Daily closing prices for BCE and Canadian Tire from January 1997 to December 2001.


Visual inspection of historical time series gives clues on the key statistical properties. Figure 4.2 shows the daily closing prices over 4 years for two Canadian stocks traded on the Toronto Stock Exchange (TSX): Bell Canada Enterprises (BCE) and Canadian Tire (CTRa). The scatter plot in Figure 4.3 shows that the daily returns form a cloud of samples around the origin in what resembles a multivariate unimodal distribution. The time series can be divided into segments with the same time span as the returns in the model $\{R_i\}_{i=1}^{\infty}$. For each time interval, the relative return can be computed as

$$r_i = \frac{S_i - S_{i-1}}{S_{i-1}}, \quad i = 1, \ldots, d, \qquad (4.8)$$

where $S_{i-1}$ and $S_i$ are, respectively, the prices at the beginning and end of the time interval. Since the returns $\{R_i\}_{i=1}^{\infty}$ in the model are independent and identically distributed, the computed (observed) returns $r_i$ are viewed, rightly or wrongly, as independent samples from the same distribution. After settling on a family of distributions for the random-walk increments, the parameters of this distribution can be estimated from the time series of returns $\{r_i\}_{i=1}^{d}$.
Many generalizations of the random-walk model have been proposed to correct short- 
comings revealed in empirical studies; see, for instance, [CLM97]. Over time periods of a 
few days one can make the simplifying assumption that the returns $\{R_i\}_{i=1}^{\infty}$ are independent and identically distributed. For time periods spanning more than a few years, however, the returns are not identically distributed. To obtain the current reading and forecast for the volatility, it is standard practice either to use only recent data or to use a weighting scheme to
attribute a lesser weight to older data or to model the intertemporal dependencies by means 
of more elaborate statistical models, such as ARCH and GARCH [Eng82, Bol86, Nel91, 
Hul00]. 
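One possible weighting scheme of the kind mentioned above is exponential weighting, in which the weight of a squared return decays geometrically with its age. The sketch below is an illustration under stated assumptions, not the specific method of the cited references: it assumes a zero mean return, and the decay factor 0.94 is only a frequently quoted choice for daily data.

```python
def ewma_variance(returns, decay=0.94):
    """Exponentially weighted variance estimate (zero-mean assumption).

    returns : list of returns ordered from oldest to most recent
    decay   : geometric decay factor in (0, 1); older data receives less weight
    """
    if not returns:
        raise ValueError("need at least one return")
    if not 0.0 < decay < 1.0:
        raise ValueError("decay must lie strictly between 0 and 1")
    weighted_sum = 0.0
    weight_total = 0.0
    weight = 1.0 - decay
    for r in reversed(returns):  # most recent observation first
        weighted_sum += weight * r * r
        weight_total += weight
        weight *= decay
    # Normalize so that the weights of a finite sample sum to one.
    return weighted_sum / weight_total


if __name__ == "__main__":
    calm_then_shock = [0.0] * 50 + [0.02]
    shock_then_calm = [0.02] + [0.0] * 50
    # A recent shock raises the estimate far more than an old one.
    print(ewma_variance(calm_then_shock), ewma_variance(shock_then_calm))
```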
FIGURE 4.3 Scatter plot of relative returns for BCE and Canadian Tire.


4.1 Risk-Factor Models 


Recall that, in the random-walk model, returns are modeled as a sequence $\{R_i\}_{i=1}^{\infty}$ of independent and identically distributed random variables. In this section, we discuss three different instances of this model, i.e., three different alternatives for the distribution of the random variables: the normal random walk, the asymmetric Student's t-distribution, and the nonparametric density estimator due to Parzen [Par61]. The methods will be generalized to the multivariate case in the next section.


4.1.1 The Lognormal Model


In the lognormal model, the distribution of log-returns

$$R_i \sim N(\mu, \sigma^2) \qquad (4.9)$$

is normal with mean $\mu$ and volatility $\sigma$. The mean can be estimated using the sample returns

$$\hat\mu = \frac{1}{d}\sum_{i=1}^{d} r_i \qquad (4.10)$$

and the variance by

$$\hat\sigma^2 = \frac{1}{d-1}\sum_{i=1}^{d} (r_i - \hat\mu)^2. \qquad (4.11)$$

See, for instance, [LM86]. Some authors advocate using estimators that give more weight to recent returns than to old ones (see, for example, [Mor96a, Hul00]).
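The two estimators can be sketched with the standard library as follows. The price series is hypothetical, and the variance uses the unbiased divisor $d - 1$, which is what `statistics.variance` computes; a divisor of $d$ would give a slightly smaller value.

```python
import math
import statistics


def log_returns(prices):
    """Log-returns between consecutive prices."""
    return [math.log(later / earlier) for earlier, later in zip(prices, prices[1:])]


def estimate_normal_parameters(returns):
    """Sample mean and unbiased sample variance of a list of returns."""
    mu_hat = statistics.mean(returns)
    var_hat = statistics.variance(returns, mu_hat)  # divides by d - 1
    return mu_hat, var_hat


if __name__ == "__main__":
    prices = [100.0, 101.5, 100.8, 102.3, 101.9, 103.0]  # hypothetical closing prices
    mu_hat, var_hat = estimate_normal_parameters(log_returns(prices))
    print(mu_hat, math.sqrt(var_hat))
```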

To illustrate the performance, we estimate the parameters fi and ọ° for daily returns 
for the BCE time series. Figure 4.4 shows the quantile-quantile plot? for the fitted normal 
distribution. It is clear that the normal model is a good approximation for small returns, but 
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FIGURE 4.4 Quantile-quantile plot for the normal random walk with parameters estimated from 
4 years of daily returns for BCE. 


3 A quantile-quantile plot is a method for comparing two distributions. Given a set of observations, we use it to
compare the empirical distribution and a distribution fitted to this data. Sorting the observations gives the empirical
cumulative distribution function (cdf). Each observation, which corresponds to a quantile, and the corresponding
quantile for the fitted distribution are marked in the plot. If the two distributions are the same, the points fall on
the diagonal reference line. Deviations from the diagonal line indicate that one distribution has fatter or thinner tails
than the other. To learn more about this, the reader is referred to the relevant numerical project in Part II.


4.1.2 
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for both the negative and positive tails the distribution does not fit the data. Fat tails are 
typical for stock returns; to estimate value-at-risk, where we need to compute tail quantiles, 
the normal model is less suitable. The next two subsections explore different approaches to 
construct random-walk models with more realistic tails. 


The Asymmetric Student's t Model 


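Student's t-distributions have fat tails. The density for a t-distributed random variable is

$$p_T(x;\nu) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{\nu\pi}}
\left(1 + \frac{x^2}{\nu}\right)^{-\frac{\nu+1}{2}}, \qquad x \in \mathbb{R}; \tag{4.12}$$

the mean is $\mu = 0$, and the variance for $\nu > 2$ is

$$\sigma^2 = \frac{\nu}{\nu - 2}. \tag{4.13}$$

The normalization factor involves the gamma function $\Gamma(\cdot)$. The degrees of freedom $\nu$ control
the fatness of the tails; as $\nu \to \infty$, the distribution converges to the normal distribution.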

An alternative to the normal model is to define a random walk with t-distributed increments.
Since the fatness of the tails can be different for negative and positive returns, we
generalize this idea and let each random variable in the sequence $\{R_i\}_{i=1}^{\infty}$ be distributed as


A= nto ja-o(* 2)ar, +o je (42)e-nr (4.14) 


The random variables $T_+$ and $T_-$ are t-distributed with degrees of freedom $\nu_+$ and $\nu_-$,
respectively. The random variable $B$ is a Bernoulli random variable; $B$ takes the value 0 or 1,
each with probability 0.5. The random variables $T_-$, $T_+$, and $B$ are independent. We say that $A$ is an
asymmetric Student's t-distributed random variable. The density, figuratively a density made
up of a Student's t pdf cut in half, is
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a= |: TE E 
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if x <m, 





(4.15) 








if x >m. 





Since the two regions each make up half of the density, m is the median of the distribution, 
and, with a little algebra, it is easy to derive moment properties relative to the median. We 
then have the following result, whose proof is left as an exercise. 


Proposition 4.1. Suppose that $\nu_- > 4$ and $\nu_+ > 4$. Then an asymmetric t-distributed random
variable, defined by equation (4.14), satisfies the following moment properties:


(i) The expectation is 





n® = E[A—m]=oa 
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(ii) The second moment is 
n® = E[(A—m)’]= 
(iii) The second conditional moments are, for negative values, 
© = E[(A—m)|A < m] = 20°p 
and, for positive values, 


nÊ = E[(A—m)?|A > m] = 20°(1—p). 
(iv) The fourth conditional moments are, for negative values, 


6 
n® = E[(A—m)‘|A < m] = 204 p° [> + | 
v_— 


and, for positive values, 
6 
nÊ = E[(A—m)*|A > m] = 204 (1 — p}? [3+ | 
v,—4 


Once the moment properties are known, estimating the parameters in the model is straightforward.
The first step is to compute the median $m$ of the observed returns $\{r_i\}_{i=1}^{d}$ by sorting
the samples and taking $m$ to be the order-$(k+1)$ value if $d = 2k+1$ is odd, or the average of
the order-$k$ and order-$(k+1)$ values if $d = 2k$ is even. Then find the sample estimate for the
second moment

$$\hat\sigma^2 = \frac{1}{d-1}\sum_{i=1}^{d} (r_i - m)^2.$$


We then estimate the contribution $p$ to the second moment from the negative and the positive
halves. Let $d = d_- + d_+$, where $d_-$ and $d_+$ are the number of observations less than and
greater than $m$, respectively. Then


P= sa Lay, 


rsm 


Finally, using the sample estimates for the fourth moments, 


W == En- my and P= =2 Ee- m), 


ri <m d, ři >m 
we can solve for estimates of the degrees of freedom $\nu_+$ and $\nu_-$,


=p +4 and gan Oe ew, 
204p? 24(1-P 

The advantage of the asymmetric t model over the normal model is that, as illustrated 
by the quantile-quantile plot in Figure 4.5, the tails of the empirical distribution can be 
reproduced more accurately. However, this improvement comes at a price, since the pdf has 
a discontinuity at the center. The jump is counterintuitive and the implementation of this 
model is more difficult, but in comparison to the advantage of increased accuracy these are 
minor concerns. 


4.1.3 
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FIGURE 4.5 BCE quantile-quantile plot for the random walk model with the asymmetric t model. 


The Parzen Model 


A nonparametric density estimator is an alternative to using a parametric method, such as
either of the first two examples. Let $\{r_i\}_{i=1}^{d}$ be samples from a distribution with an unknown
pdf, $p(x)$. In [Par61] Parzen develops and analyzes a family of estimates of the form

$$\hat p_d(x) = \frac{1}{dh}\sum_{i=1}^{d} K\left(\frac{x - r_i}{h}\right), \tag{4.16}$$

initially suggested by Rosenblatt in [Ros56]. In our examples, we use the weighting function [TT90]

$$K(x) = \frac{15}{16}\left(1 - x^2\right)^2 \quad \text{for } |x| \le 1. \tag{4.17}$$
Note that $K(x) \ge 0$ is a kernel function that integrates to unity. Parzen shows that, if $p(x)$
is sufficiently smooth, $\hat p_d(x)$ is asymptotically unbiased and, for an optimal sequence of
h-values, the mean square error converges to zero as⁴

$$E\left[(\hat p_d(x) - p(x))^2\right] = O\left(d^{-4/5}\right).$$


We refer to a random walk using the Parzen estimate (4.16) for the pdf as the Parzen model. 
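
The estimate (4.16) is short to write down directly. The sketch below reads the weighting function (4.17) as the biweight kernel $\frac{15}{16}(1-x^2)^2$ on $|x| \le 1$ (zero elsewhere) and uses made-up returns and a made-up smoothing parameter $h$; it sums over all samples on every evaluation, so it illustrates the formula rather than an efficient implementation.

```python
def biweight(u):
    """Weighting function (4.17), read as 15/16 * (1 - u^2)^2 on |u| <= 1 and 0 elsewhere."""
    return 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0


def parzen_pdf(x, samples, h):
    """Parzen estimate (4.16): kernels of width h centred at each sample, averaged."""
    return sum(biweight((x - r) / h) for r in samples) / (len(samples) * h)


# Hypothetical returns and smoothing parameter, for illustration only.
samples = [0.012, -0.004, 0.007, -0.021, 0.003, 0.015, -0.009, 0.001]
h = 0.01

# Midpoint-rule check that the estimate integrates to one over its support.
lo, hi, steps = min(samples) - h, max(samples) + h, 20000
dx = (hi - lo) / steps
area = sum(parzen_pdf(lo + (k + 0.5) * dx, samples, h) for k in range(steps)) * dx
```

Because the kernel vanishes outside $[-1, 1]$, the estimate is exactly zero more than $h$ away from the nearest sample, which is the compact-support property of this weighting function.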

Similar to the asymmetric t model, the Parzen model can recreate the fat tails more 
accurately than the normal model, and it also seems to have a slight advantage over the 
asymmetric t model, as illustrated by the quantile-quantile plot in Figure 4.6. The advantage 
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FIGURE 4.6 BCE quantile-quantile plot for the random-walk model with the Parzen density estimate. 


4 Parzen presents a theory for density estimates of the form of equation (4.16), with general weighting functions
$K(x)$. Let $h_d \to 0$ as the number of samples $d \to \infty$. He shows that density estimates of the form of equation (4.16)
converge (pointwise in a mean square sense) to a continuous pdf as $d \to \infty$. More precisely, given a sequence of
smoothing parameters $\{h_d\}_{d=1}^{\infty}$ with $\lim_{d\to\infty} h_d = 0$ and $\lim_{d\to\infty} d\,h_d = \infty$,

$$E\left[(\hat p_d(x) - p(x))^2\right] \to 0 \quad \text{as } d \to \infty.$$

The sequence of smoothing parameters giving optimal rate of convergence depends on both the point $x$ and the pdf $p(x)$
as well as the weighting function $K(x)$. See Parzen [Par61] for examples of and details about general weighting functions.
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of using a nonparametric model is that it does not rely on specific assumptions about the 
shape of the density. There are three disadvantages to the Parzen model. First, the optimal 
smoothing parameter h is unknown. While experimenting with different stocks, we have 
found that taking $h$ equal to the standard deviation works well.⁵ Second, for our choice of
weighting function, the density estimate has compact support. However, the support covers 
the region of interest for value-at-risk calculations, so it should have a minor influence on the 
result. Third, evaluating equation (4.16) or the corresponding cumulative distribution function 
(cdf) for different values of x is expensive for large samples. In our implementation, we 
avoid summing over all sample points by using cubic splines to approximate the cdf and 
the pdf. 


Multivariate Models 


So far we have only considered models for the return on a single risk factor. In general, 
portfolios depend on many risk factors. Therefore we must extend the one-dimensional 
random-walk models, presented in the previous sections, to the multivariate case. 

In the multivariate random walk, $\{R_i\}_{i=1}^{\infty}$ is a sequence of $\mathbb{R}^n$-valued vectors of random
variables. The random vectors are independent and identically distributed. The difficulty in
constructing a realistic multivariate model is that returns on the risk factors are typically
dependent, as exemplified by Figure 4.7. To approximate the dependence structure without
introducing an overly complex model, we restrict our attention to multivariate models where
the random vectors $\{R_i\}_{i=1}^{\infty}$ satisfy


R, = A™'X, +b. (4.18) 


Moreover, we assume that the random vector $X$ has independent components and the pdf is
a product of one-dimensional density functions:

$$p(x) = p_1(x_1) \cdots p_n(x_n).$$


We postpone the discussion about how to choose the linear transformation, i.e., the matrix A 
and the vector b, to Section 4.3, after discussing portfolios of derivatives. 

Finding a stochastic process to model stock prices in continuous time is a more difficult
problem. Returns are often modeled by stochastic differential equations (SDEs). As discussed
in Chapter 1, Brownian motion is the natural continuous-time generalization of a random
walk with normal increments. In this model, the return process is a constant-coefficient SDE,
$dR_t = \mu\,dt + \sigma\,dW_t$. Like the normal model for stock prices, geometric Brownian motion
underestimates the likelihood of large returns: It does not have fat tails.

Many different types of continuous-time models have been proposed and studied in the
literature, in particular for pricing derivatives. If the returns are a stationary Markov process,
then, for example, the sequence $\{r_i\}_{i=1}^{d}$ of historical returns can be used to find an estimate
for the transition density, that is, the time-dependent probability density $p(r, t)$ representing the
density for the return $r$ at time $t$. Figure 4.8 shows the Parzen estimate for the transition
density for the stock BCE. A good model has a transition density that is close to this estimate.


5 This choice of $h$ may work well in our examples, but it is not a satisfactory solution in general since a fixed
smoothing parameter does not give convergence as the number of samples $d \to \infty$. The estimate converges for
a sequence of smoothing parameters that decrease to zero as the number of samples increases (see [Par61] for
details).
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FIGURE 4.7 Principal components superimposed on the scatter plot for the returns on BCE and CTRa. 
FIGURE 4.8 Parzen estimate of the transition density for BCE.
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4.2 Portfolio Models 


In this section we discuss portfolios and introduce a new method for portfolio-dependent 
parameter estimation. The purpose of this section is first to connect the idea of a portfolio 
and the changes in its value to the risk-factor models considered in the previous section. 
The second purpose is to give a detailed presentation of a portfolio-dependent estimation 
procedure and to discuss the related computational issues. 

A portfolio is represented as a k-vector $\theta$, where $\theta_i$ is the position in the ith security. When
considering a market without derivatives one assumes the number of securities $k$ and the number
of risk factors $n$ are the same, $k = n$. Generally, in a market with derivatives, the number of
securities is greater than the number of risk factors, $k > n$. In practice, if $\theta_i$ represents shares, then
it is an integer; but in modeling portfolios it is convenient to let $\theta_i$ be a real number. Furthermore,
in a market where short selling is allowed, $\theta_i$ can be either positive or negative.

Let $V_t^i$ be the price of the ith security at time $t$. If the security is a direct investment in
the risk factor, then $V_t^i$ satisfies the identity

$$V_t^i = S_t^i = S_0^i\,(1 + R_t^i). \tag{4.19}$$

If, on the other hand, it is a derivative security, then $V_t^i$ is a function of the risk factors.
Assuming that the portfolio remains unchanged, the dollar value of the portfolio at time $t$ is
the sum

$$\Pi_t = \sum_{i=1}^{k} \theta_i V_t^i. \tag{4.20}$$

The change in the value from time 0 to time $t$ is

$$\Delta\Pi_t = \Pi_t - \Pi_0 = \sum_{i=1}^{k} \theta_i\,(V_t^i - V_0^i). \tag{4.21}$$


If the Black-Scholes model were correct, the drift would be the only difference between the 
stock price processes for the probability spaces with the risk-neutral versus the real-world 
measures. However, this approach does not reproduce observable prices for traded options. 
Therefore, instead of using a volatility estimated from stock price data, pricing models 
typically use parameters implied by option prices; i.e., the pricing model is used as a form of 
interpolation scheme. In value-at-risk simulation, we are interested in changes in value over 
a short time period, and what is needed is a model that captures the local dynamics of the 
value. Hence, compared to the models used for derivatives trading, where the whole lifespan 
of the contract must be considered, the quality of the pricing model is less critical. In our 
examples we use the Black-Scholes model to construct such local approximations, but the 
ideas could in principle be extended to more complex pricing models. 

Two problems we touch lightly upon in this chapter are volatility risk and mapping
of risk factors. Both topics are important in the implementation of market-risk models. As
mentioned, parameters of option-pricing models must be chosen to reproduce prices in the
market. Unfortunately, parameters such as the volatility $\sigma$ in the Black-Scholes model tend
to change over time. Therefore, a natural extension is to make volatility stochastic (see, for
example, [Wil00]). This makes potential changes in volatility a source of risk, and it can
be introduced as a risk factor in a value-at-risk model. In Sections 4.5 and 4.7, we study a
very simple version of such a model, and we see that it leads to some interesting qualitative
changes to the problem. Mapping of risk factors is the process where some risk factors are
replaced by a few general factors. An example is replacing a continuum of interest rates
with different maturities by a few representative rates. The dimension-reduction problem in
Section 4.7 can be viewed as an automatic mapping method.

As we have seen, the price for a derivative may be a complicated function. Often there
is no explicit formula and it must be priced using a separate simulation. Furthermore, the
number of different types of securities in a derivatives portfolio will typically be much larger
than the number of basic risk factors. In such cases, Taylor's theorem provides a tool to
approximate the value of a portfolio by a function with a simple mathematical form.


4.2.1


Δ-Approximation


Taylor approximations are accurate close to the point of expansion if the function is
sufficiently smooth. Under this assumption and using a first-order approximation, we obtain⁶

$$\Pi_t = \Pi_0 + \left(\sum_{i=1}^{k} \theta_i \frac{\partial V^i}{\partial t}\right) t
+ \sum_{j=1}^{n} \left(\sum_{i=1}^{k} \theta_i \frac{\partial V^i}{\partial s_j}\right) s_0^j\, r_j
+ O\left(t^2 + |r|^2\right).$$

The $\mathbb{R}^n$-valued vector $R_t$ of returns has components $R_t^j = r_j = (s_t^j - s_0^j)/s_0^j$, where $s_0^j$ are
initial prices and $S_t^j = s_t^j$, $j = 1,\ldots,n$, are time-t prices of the underlying assets. Collecting
the coefficients, we obtain

$$\Pi_t \approx \Pi_0 + \Theta t + \Delta^T R_t, \tag{4.22}$$

where $\Delta^T = (\Delta_1,\ldots,\Delta_n)$,

$$\Theta = \sum_{i=1}^{k} \theta_i \frac{\partial V^i}{\partial t} \quad \text{and} \quad
\Delta_j = \left(\sum_{i=1}^{k} \theta_i \frac{\partial V^i}{\partial s_j}\right) s_0^j. \tag{4.23}$$

In finance, such a linear approximation is called a Δ-approximation. As seen in Chapter 1,
the $\Delta_j$ are often used to hedge a portfolio. By taking a position that offsets the derivative,
for example, by buying the risk factor or futures contracts, a portfolio's sensitivity to changes
in the underlying can be reduced. As a consequence, in risk models for derivative portfolios
a significant component of the risk will be made up of higher-order effects.

Consider a call option on a single stock, and suppose the current price of the underlying
$S_0$ is equal to the strike price, $K = 100$, i.e., the option is at-the-money. The derivatives of
the call option value give⁷

$$\frac{\partial V_t}{\partial s} = N(d_+), \tag{4.24}$$

$$\frac{\partial V_t}{\partial t} = -\frac{s\,\sigma}{2\sqrt{T-t}}\,N'(d_+) - rKe^{-r(T-t)}N(d_-), \tag{4.25}$$

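6 Taylor's theorem gives an explicit formula for the error. For the first-order approximation the error is

$$\text{error} = \frac{1}{2}\left(\sum_{i} \theta_i \frac{\partial^2 V^i}{\partial t^2}\right) t^2
+ \sum_{j}\left(\sum_{i} \theta_i \frac{\partial^2 V^i}{\partial t\,\partial s_j}\right) s_0^j\, t\, r_j
+ \frac{1}{2}\sum_{j,l}\left(\sum_{i} \theta_i \frac{\partial^2 V^i}{\partial s_j\,\partial s_l}\right) s_0^j s_0^l\, r_j r_l,$$

where the derivatives are evaluated at some $t' \in (0, t)$ and $r'_j \in (0, r_j)$ for $j = 1,\ldots,n$.


7 The Δ and Θ are derived by differentiating the Black-Scholes equation, using $d_- = d_+ - \sigma\sqrt{T-t}$, as discussed
in Chapter 1.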


4.2.2 
where $d_+ = \dfrac{\log(s/S_0) + (r + \sigma^2/2)(T-t)}{\sigma\sqrt{T-t}}$, and we set the risk-free rate equal to the return $r =
(s - S_0)/S_0$, with $s$ as the spot. Choosing appropriate parameters,

$$\Delta = \frac{\partial V_t}{\partial s}\,S_0 \approx 59.77 \quad \text{and} \quad
\Theta = \frac{\partial V_t}{\partial t} \approx -8.12,$$

and the Δ-approximation of the value of the call option is

$$\Pi_t \approx \Pi_0 + \Theta t + \Delta r \approx 6.89 - 8.12\,t + 59.77\,r. \tag{4.26}$$

Figure 4.9 shows the Δ-approximation (as a function of the stock price $s$ for one day,
$t = 1/250$) compared to the Black-Scholes price. It is accurate for small returns but quickly
deteriorates as $|r|$ increases. Finally, when the return $|r|$ is large enough, the approximation
is negative or less than $(s - Ke^{-r(T-t)})_+$, and therefore it violates the basic principle of no
arbitrage.
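
The text does not state which parameters were chosen. As a sketch, the following Python fragment evaluates the Black-Scholes value together with (4.24) and (4.25) for an at-the-money call with $S_0 = K = 100$; the risk-free rate of 5%, volatility of 20%, and maturity of half a year are assumptions made here, selected because they give values close to the quoted 6.89, 59.77, and -8.12.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    """Standard normal density N'(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def call_value_and_derivatives(s, strike, rate, sigma, tau):
    """Black-Scholes call value, dV/ds (4.24) and dV/dt (4.25), with tau = T - t."""
    sq = sigma * math.sqrt(tau)
    d_plus = (math.log(s / strike) + (rate + 0.5 * sigma ** 2) * tau) / sq
    d_minus = d_plus - sq
    disc_strike = strike * math.exp(-rate * tau)
    value = s * norm_cdf(d_plus) - disc_strike * norm_cdf(d_minus)
    dv_ds = norm_cdf(d_plus)
    dv_dt = (-s * sigma * norm_pdf(d_plus) / (2.0 * math.sqrt(tau))
             - rate * disc_strike * norm_cdf(d_minus))
    return value, dv_ds, dv_dt


# Assumed parameters (not stated in the text).
S0 = K = 100.0
RATE, SIGMA, MATURITY = 0.05, 0.20, 0.5

value0, dv_ds, theta = call_value_and_derivatives(S0, K, RATE, SIGMA, MATURITY)
delta = dv_ds * S0  # Delta as in (4.23): derivative times the initial price


def delta_approx(t, r):
    """Linear approximation (4.26) of the option value after time t and return r."""
    return value0 + theta * t + delta * r
```

For example, `delta_approx(1 / 250, 0.01)` gives the approximate option value after one trading day in which the stock gains 1%.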


ΔΓ-Approximation


For nonlinear portfolios, the Δ-approximation is useful when the time period considered is
relatively small. But as Figure 4.9 illustrates, ignoring the curvature for an option portfolio may
lead to a large truncation error. This feature becomes particularly important if Δ-hedging is
used, since the linear term is hedged out, leaving a higher-order residual for the portfolio value.

FIGURE 4.9 The price of a European call option in the Black-Scholes model compared to the
at-the-money Δ-approximation for the value.

The nonlinearity can be better approximated by including more terms from the Taylor
series. Keeping terms of second order in the returns gives the quadratic approximation⁸

$$\Pi_t = \Pi_0 + \left(\sum_{i=1}^{k} \theta_i \frac{\partial V^i}{\partial t}\right) t
+ \sum_{j=1}^{n} \left(\sum_{i=1}^{k} \theta_i \frac{\partial V^i}{\partial s_j}\right) s_0^j\, r_j
+ \frac{1}{2}\sum_{l=1}^{n}\sum_{j=1}^{n} \left(\sum_{i=1}^{k} \theta_i \frac{\partial^2 V^i}{\partial s_l\,\partial s_j}\right) s_0^l s_0^j\, r_l r_j
+ O\left(t^2 + t|r| + |r|^3\right)$$

or

$$\Pi_t \approx \Pi_0 + \Theta t + \Delta^T R_t + \frac{1}{2} R_t^T \Gamma R_t. \tag{4.27}$$

The vector $\Delta$ and the scalar $\Theta$ are as in equation (4.22), and

$$\Gamma_{lj} = \left(\sum_{i=1}^{k} \theta_i \frac{\partial^2 V^i}{\partial s_l\,\partial s_j}\right) s_0^l s_0^j, \qquad l, j = 1,\ldots,n.$$


In finance, quadratic approximation (4.27) is called a ΔΓ-approximation. Because it
approximates the curvature of the value function, it is a more accurate local approximation to the
portfolio value than the linear Δ-approximation.

We return to the (single risk factor) example with Δ-approximation (4.26) for a call
option. In this case, the second derivative of the value is

$$\frac{\partial^2 V_t}{\partial s^2} = \frac{N'(d_+)}{s\,\sigma\sqrt{T-t}}.$$

Then

$$\Gamma = \frac{\partial^2 V_t}{\partial s^2}\,S_0^2 \approx 273.6,$$

which gives the ΔΓ-approximation

$$\Pi_t \approx \Pi_0 + \Theta t + \Delta r + \frac{1}{2}\Gamma r^2 \approx 6.89 - 8.12\,t + 59.77\,r + \frac{273.6}{2}\,r^2, \tag{4.28}$$

where $r = (s - S_0)/S_0$. The approximation compared to the exact values is shown in
Figure 4.10. In comparison to the Δ-approximation, the ΔΓ-approximation is, not surprisingly,
much closer to the price in the Black-Scholes model.
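
To make the comparison concrete, the sketch below evaluates both approximations against the exact Black-Scholes value after one trading day for a few returns. As before, the rate, volatility, and maturity are assumptions made for illustration (the text only says "appropriate parameters"); with these values the computed Γ is close to the quoted 273.6.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    """Standard normal density N'(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def call_value(s, strike, rate, sigma, tau):
    """Black-Scholes value of a European call with time to maturity tau."""
    sq = sigma * math.sqrt(tau)
    d_plus = (math.log(s / strike) + (rate + 0.5 * sigma ** 2) * tau) / sq
    return s * norm_cdf(d_plus) - strike * math.exp(-rate * tau) * norm_cdf(d_plus - sq)


# Assumed parameters (not stated in the text).
S0 = K = 100.0
RATE, SIGMA, MATURITY = 0.05, 0.20, 0.5
T_STEP = 1.0 / 250.0  # one trading day

sq0 = SIGMA * math.sqrt(MATURITY)
d_plus0 = (math.log(S0 / K) + (RATE + 0.5 * SIGMA ** 2) * MATURITY) / sq0
d_minus0 = d_plus0 - sq0

value0 = call_value(S0, K, RATE, SIGMA, MATURITY)
delta = S0 * norm_cdf(d_plus0)
theta = (-S0 * SIGMA * norm_pdf(d_plus0) / (2.0 * math.sqrt(MATURITY))
         - RATE * K * math.exp(-RATE * MATURITY) * norm_cdf(d_minus0))
gamma = S0 ** 2 * norm_pdf(d_plus0) / (S0 * sq0)


def delta_approx(t, r):
    """Linear approximation (4.26)."""
    return value0 + theta * t + delta * r


def delta_gamma_approx(t, r):
    """Quadratic approximation (4.28)."""
    return delta_approx(t, r) + 0.5 * gamma * r * r


# Absolute errors (linear, quadratic) against the exact value after one day.
errors = {}
for r in (-0.05, -0.01, 0.01, 0.05):
    exact = call_value(S0 * (1.0 + r), K, RATE, SIGMA, MATURITY - T_STEP)
    errors[r] = (abs(delta_approx(T_STEP, r) - exact),
                 abs(delta_gamma_approx(T_STEP, r) - exact))
```

In this example the quadratic error is smaller than the linear error at each of the four returns, in line with the comparison of Figures 4.9 and 4.10.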


8 The truncation error is 


1 vi PVE j 
error = 5 (zs JA 4 ss 2885, sir |e 


i LJ 





1 avi 
OVi Sisk rir t 0; Sisksh ert rl 
+31 PG iaasa 0) t3 2 s sps, °° EM 


for some $t' \in (0, t)$ and $r'_j \in (0, r_j)$ for $j = 1,\ldots,n$.
FIGURE 4.10 The price of a European call option compared to the at-the-money ΔΓ-approximation.


4.3 Statistical Estimations for ΔΓ-Portfolios


In the algorithm developed later in Section 4.5, we assume that derivative portfolios are
represented as ΔΓ-approximations (4.27). The change in value, over the time
period from time 0 to time $t$, is then

$$\Delta\Pi_t = \Pi_t - \Pi_0 \approx \Delta\hat\Pi_t = \Theta t + \Delta^T R_t + \frac{1}{2} R_t^T \Gamma R_t. \tag{4.29}$$

The return vector $R_t$ on the risk factors is a vector of random variables, and, when $\Delta\hat\Pi_t$ is
viewed as a function of $R_t$, it is a random variable modeling the change in the portfolio's
value over the time period $t$. We assume that the stochastic model for returns is of the type
presented in Section 4.1.

The parameters of the risk-factor model are estimated from a time series of historical
returns, independently of the pricing model. This choice comes at the cost of ignoring
any connections, suggested by the theory of arbitrage-free pricing, between
the real-world and risk-neutral processes. However, it has the important advantage of more
flexibility in choosing the underlying model for price changes. Furthermore, as illustrated by
the Black-Scholes model, the conclusion that risk-neutral and real-world processes have the
same variance, a property implied by the pricing model, may not hold up to scrutiny. For
these reasons, we believe that this pragmatic approach is justified.
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Portfolio Decomposition and Portfolio-Dependent Estimation 


In this section, we present a new method for portfolio-dependent estimation of parameters for
the risk-factor models. The strategy for parameter estimation builds on the observation that a
ΔΓ-approximation can be decomposed as a sum of one-dimensional quadratic functions. This
decomposition has a long history in applied mathematics and statistics, and it has been used
by other authors in the computation of value-at-risk. Still, the multivariate risk-factor model
we present is fundamentally different, in that the resulting risk-factor models are portfolio
dependent.

Suppose that $\{r_i\}_{i=1}^{d}$ is a time series of returns, where $r_i$ is an n-vector with component
observations $r_i^j$, $j = 1,\ldots,n$. We know, as discussed in Section 4.1, that the mean $\mu$ can be
estimated with standard sample estimators (in the case of the asymmetric t model, the mean
is replaced by the median $m$). Similarly, the standard sample statistics

$$\hat C_{jl} = \frac{1}{d-1}\sum_{i=1}^{d} \left(r_i^j - \hat\mu^j\right)\left(r_i^l - \hat\mu^l\right) \tag{4.30}$$

can be used to estimate the matrix elements $C_{jl} = \mathrm{Cov}(R_t^j, R_t^l)$ of the covariance matrix. For
the normal model, $\hat\mu$ and $\hat C$ characterize the model completely. For the other two models, we
explain how to estimate the remaining parameters.

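Recall that, as discussed in Section 4.1.4, we want to approximate the dependence structure
of the risk-factor returns with a product pdf. The first step to construct such a model is
to observe that a ΔΓ-approximation can be factored by solving the generalized eigenvalue
problem

$$\Lambda = A^T \Gamma A, \qquad C = A A^T. \tag{4.31}$$

The matrix $A$ is nonsingular and

$$\Lambda = \mathrm{diag}(\lambda_1,\ldots,\lambda_n).$$

Let

$$X = A^{-1}(R_t - \hat\mu)$$

define a new vector of random variables. Then, since $E[R_t - \hat\mu] = 0$ and $C = E[(R_t - \hat\mu)
(R_t - \hat\mu)^T]$, we have

$$E[X] = 0 \quad \text{and} \quad E[XX^T] = I. \tag{4.32}$$

Also, by substituting $R_t = AX + \hat\mu$ we can express $\Delta\hat\Pi_t$ in terms of our new variables:

$$\Delta\hat\Pi_t = \left(\Theta t + \hat\mu^T \Delta + \frac{1}{2}\hat\mu^T \Gamma \hat\mu\right) + X^T A^T (\Delta + \Gamma\hat\mu) + \frac{1}{2} X^T \Lambda X. \tag{4.33}$$

Defining

$$\Xi = \Theta t + \hat\mu^T \Delta + \frac{1}{2}\hat\mu^T \Gamma \hat\mu, \tag{4.34}$$

$$\Delta' = A^T (\Delta + \Gamma\hat\mu) \tag{4.35}$$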


4.3.2 
gives the ΔΓ-approximation to the change in portfolio value:

$$\Delta\hat\Pi_t = \Xi + \sum_{i=1}^{n} \left(\Delta'_i X_i + \frac{1}{2}\lambda_i X_i^2\right). \tag{4.36}$$

This is simply a sum of one-dimensional quadratic functions

$$\pi_i = \Delta'_i x_i + \frac{\lambda_i}{2} x_i^2. \tag{4.37}$$

We postulate that the joint probability density for $X = x$ is a product pdf,

$$p(x) = p_1(x_1) \cdots p_n(x_n), \tag{4.38}$$

or equivalently that the components of $X$ are independent. As equation (4.32) shows, the
components of $X$ are uncorrelated by construction, so the approximation with a product
pdf extrapolates from this property to a more general assumption. Finally, the remaining
parameters for the asymmetric t and Parzen models can be estimated for each component $X_i$
individually. The complete parameter estimation and factorization procedure is summarized
in Algorithm 1.





Algorithm 1
Parameter Estimation and Factorization


Input: A ΔΓ-approximation. A time series $\{r_i\}_{i=1}^{d}$ of daily returns.


Output: Parameter estimates for the market model. A factorization of the
ΔΓ-approximation.


• Compute estimates for the mean $\hat\mu$ and the covariance matrix $C$.
• Solve the eigenvalue problem

$$\Lambda = A^T \Gamma A, \qquad C = A A^T.$$

• Compute

$$\Xi = \Theta t + \hat\mu^T \Delta + \frac{1}{2}\hat\mu^T \Gamma \hat\mu, \qquad \Delta' = A^T (\Delta + \Gamma\hat\mu).$$

• For each variable $X_i$, estimate the remaining parameters from the transformed
returns $\{A^{-1}(r_i - \hat\mu)\}_{i=1}^{d}$ using the methods in Section 4.1.
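
The factorization step of Algorithm 1 can be sketched with NumPy as follows. This is one possible route, assuming the covariance estimate is positive definite: a Cholesky factor of $C$ followed by a symmetric eigenvalue decomposition. The returns, Γ, Δ, and Θt below are made up for illustration, and the function names are hypothetical.

```python
import numpy as np


def factorize(gamma, cov):
    """Return (A, lam) with A.T @ gamma @ A = diag(lam) and A @ A.T = cov."""
    lower = np.linalg.cholesky(cov)                    # cov = lower @ lower.T
    lam, q = np.linalg.eigh(lower.T @ gamma @ lower)   # symmetric eigenproblem
    return lower @ q, lam


def transform(theta_t, delta, gamma, mu, cov):
    """Constant term, transformed Delta', eigenvalues and matrix A of Algorithm 1."""
    a, lam = factorize(gamma, cov)
    const = theta_t + mu @ delta + 0.5 * mu @ gamma @ mu
    delta_prime = a.T @ (delta + gamma @ mu)
    return a, lam, const, delta_prime


# Made-up inputs, for illustration only.
rng = np.random.default_rng(0)
returns = rng.normal(scale=0.02, size=(500, 3))        # d = 500 dates, n = 3 factors
mu = returns.mean(axis=0)
cov = np.cov(returns, rowvar=False)                    # divisor d - 1
gamma = np.array([[60.0, 5.0, 0.0], [5.0, 40.0, 2.0], [0.0, 2.0, 80.0]])
delta = np.array([25.0, 10.0, -5.0])
theta_t = -0.03

a, lam, const, delta_prime = transform(theta_t, delta, gamma, mu, cov)

# Transformed returns x_i = A^{-1}(r_i - mu), one row per date.
x = np.linalg.solve(a, (returns - mu).T).T

# The quadratic form evaluated directly and through the factorization.
direct = theta_t + returns @ delta + 0.5 * np.einsum("ij,jk,ik->i", returns, gamma, returns)
factored = const + x @ delta_prime + 0.5 * (x ** 2) @ lam
```

The arrays `direct` and `factored` agree up to rounding, and the sample covariance of the rows of `x` is the identity, which is the uncorrelatedness that the product-pdf assumption extrapolates from.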








Testing Independence 


The assumption that the risk-factor returns can be modeled by a product pdf is central to what
follows in Section 4.5. It is therefore natural to question whether it is a valid assumption.
We note that returns tend to be scattered around a central point, as in Figure 4.3. Also, the
parameter estimation procedure produces uncorrelated random variables for which the sample
correlation is zero by construction. These two points form the basis for our belief that a
product pdf is a reasonable model.

For our two-dimensional example, with returns on BCE and CTRa, it is possible to
construct a statistical test.⁹ To test the assumption, we use a binomial test, as discussed,
for example, in [LM86]. As we will show, the experiment suggests that in this case, the
independence assumption is valid.¹⁰

Consider a portfolio with at-the-money call options on BCE and CTRa. Using the
Black-Scholes model to price this portfolio, we get a ΔΓ-approximation with


25.4932 69.4300 0 
reer ae r=| 0 aak 


The portfolio-dependent estimation procedure, with the standard sample estimates and the 
data in Figure 4.3, gives us 


A, = 0.0064, A, =0.1371, 
A, =0.0169, A! = 0.6382. 


We want to test if the two portfolio components $\pi_1$ and $\pi_2$ are independent. To formulate
this question as a binomial test, viewing $\pi_1$ and $\pi_2$ as random variables, define the events

$$A_1 = \{\pi_1 : \pi_1 > \hat\mu_{\pi_1}\}, \qquad A_2 = \{\pi_2 : \pi_2 > \hat\mu_{\pi_2}\},$$

where $\hat\mu_{\pi_i}$ is the sample mean for $\pi_i$. From the time series we estimate the probabilities of
the events:

$$\hat p_1 = \frac{\#\text{ samples in } A_1}{\#\text{ samples}} = \frac{476}{1006}, \qquad
\hat p_2 = \frac{\#\text{ samples in } A_2}{\#\text{ samples}} = \frac{490}{1006}.$$

9 In fact, the possibilities for statistical tests are infinite; see, for example, [Feu93] and the references therein. If
$X$ and $Y$ are independent, then for any functions $h$ and $g$

$$E[h(X)\,g(Y)] = E[h(X)]\,E[g(Y)],$$

provided the integrals exist. This equality is taken as the null hypothesis for a statistical test. Given a set of samples,
it is possible to compare sample estimates for the left- and right-hand sides. By the law of large numbers, the two
converge for independent random variables. If it is possible to find confidence intervals for the sample estimates,
then the estimates can be tested to accept or reject the null hypothesis.

Statistical tests cannot prove independence; they can only reject independence through rejection of the null
hypothesis. Furthermore, although the same idea could in principle be used in high dimensions, formulating tests
with sufficient power and obtaining a set of samples large enough makes such tests practically infeasible.

10 Whether the independence assumption holds is not the central question. The important question is whether the 
approximation leads to good simulation results. Our experience using different stocks and portfolios is that for four 
years of data, rejections become more common as the scales of the portfolio components become more different. 
However, a difference in scales indicates that the influence of one component dominates the dynamics, making the 
independence a secondary issue. For shorter time series, the rejection rate decreases, and rejections with two years 
of data appear to be rare. 
FIGURE 4.11 Scatter plot for the empirical returns on the two portfolio components $\pi_1$ and $\pi_2$. The
event $B$ corresponds to the bottom-left and top-right quarters of the plane.


Consider the event

$$B = (A_1 \cap A_2) \cup (A_1^c \cap A_2^c),$$

i.e., the event that the pair of returns are both either larger or smaller than their estimates for
the mean (see Figure 4.11). We then formulate our null hypothesis: The probability $P(B)$ is
equal to

$$q = \hat p_1 \hat p_2 + (1 - \hat p_1)(1 - \hat p_2).$$

Estimate the probability of $B$:

$$\hat q = \frac{\#\text{ samples in } B}{\#\text{ samples}} = \frac{522}{1006}.$$

Treating each sample as an independent Bernoulli trial and normalizing the random variable
of the number of successes leads to

$$\frac{522 - 1006\,q}{\sqrt{1006\,q\,(1 - q)}} = 1.15,$$

which is within both the confidence intervals for 95%, [-1.96, 1.96], and 90%, [-1.64, 1.64].
Therefore, the null hypothesis should be accepted (i.e., not rejected) and the portfolio
components have passed the independence test.
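
The arithmetic of the test is easy to reproduce. The following sketch recomputes the normalized statistic from the four counts quoted above; the function name is hypothetical.

```python
import math


def independence_statistic(n_a1, n_a2, n_b, n):
    """Normalized count of samples in B under the null hypothesis of independence."""
    p1, p2 = n_a1 / n, n_a2 / n
    q = p1 * p2 + (1.0 - p1) * (1.0 - p2)   # P(B) if the two events are independent
    return (n_b - n * q) / math.sqrt(n * q * (1.0 - q))


# Counts quoted in the text: 476 and 490 samples in A_1 and A_2, 522 in B, 1006 in total.
z = independence_statistic(476, 490, 522, 1006)
rejected_at_95 = abs(z) > 1.96
rejected_at_90 = abs(z) > 1.64
```

With these counts `z` is about 1.15, and both flags are `False`.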
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4.3.3 A Few Implementation Issues 


In the generalized eigenvalue problem (4.31), the Hessian $\Gamma$ is symmetric and the covariance
matrix $C$ is nonnegative definite. Symmetric-definite eigenvalue problems arise in other
applications. The standard method [GL89] is to compute the Cholesky factorization

$$C = U^T U, \tag{4.39}$$

where $U$ is upper triangular. Since $C = AA^T$, the matrix $A$ can be factored in the form $A = U^T Q$,
where $Q$ is orthogonal (see, for example, the so-called QR algorithm [PTVF92]). Then, since
$\Lambda = A^T \Gamma A$,

$$\Lambda = Q^T \left(U \Gamma U^T\right) Q.$$

To solve this eigenvalue problem requires $O(n^3)$ floating-point operations. Combined with the
estimation of the covariance matrix, this gives a total of $O(n^3 + n^2 d)$ floating-point operations.

However, if we take advantage of the structure of problem (4.31), it is possible to improve
slightly on this procedure. Let $\{r_i\}_{i=1}^{d}$ be the time series of returns and define the $d \times n$ matrix

$$W = \begin{pmatrix} (r_1 - \hat\mu)^T \\ \vdots \\ (r_d - \hat\mu)^T \end{pmatrix}.$$

The estimate of covariance matrix (4.30) can be written as

$$\hat C = \frac{1}{d-1} W^T W.$$

A variety of estimators for the covariance matrix have been proposed in the finance literature
(see, for example, [CLM97, Hul00, Wil00]). Many of the estimators are of the form

$$\hat C = W^T D W, \tag{4.40}$$

where $D$ is a weight matrix.¹¹

Depending on whether the number of dates $d$ is greater or smaller than the number of risk
factors $n$, we get two cases. Suppose $d > n$. Rather than explicitly forming the covariance
matrix $C$, it is preferable to factor it directly by computing the (QR) factorization

$$D^{1/2} W = QU, \tag{4.41}$$

where $Q$ is a $d \times n$ matrix with orthonormal columns and $U$ is an $n \times n$ upper triangular matrix.
With $U$ from (4.41), the QR algorithm is then applied to the matrix $U \Gamma U^T$. The QR factorization takes about


11 Two examples are the exponentially weighted moving average with 


— Ad 
and the multivariate GARCH(1,1) model [Bol86, CLM97, Hul00] £„ = (1 — a — B) V +ar;r? +6 ÈX„—1, where V is 
the standard estimate for the long-run average volatility, which has 


diag(1,A,..., Ae) 





D= 
1 





, d-n 17’ 
D = diag(a, aB, aB% )+ ios —a—B)I. 
$2n^2 d$ floating-point operations, compared to approximately $n^2 d + n^3/3$ for equation (4.39).
The advantage of the QR factorization is that it can be shown that the forward error is smaller
than for the Cholesky factorization.¹²

On the other hand, if $d < n$, then the matrix $C$ will be singular. The best approach when
$d < n$ is to use factorization (4.40) and compute the Schur decomposition

$$\Lambda = Q^T \left(D^{1/2} W\right) \Gamma \left(D^{1/2} W\right)^T Q.$$

This effectively reduces the size of the eigenvalue problem from order $n$ to order $d$, and
therefore the whole step requires only $O(dn^2 + d^2 n + d^3)$ floating-point operations. We
conclude that performing computations directly on the time series matrix, rather than forming
the covariance matrix explicitly, is convenient, effective, and numerically sound.
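
For the case $d > n$, the idea of factoring the weighted time series matrix instead of the covariance matrix can be sketched with NumPy as follows. The returns and Γ are made up, and equal weights $1/(d-1)$ are assumed so that $W^T D W$ reduces to the standard estimate (4.30); other weight matrices $D$ of the form (4.40) would enter in the same way.

```python
import numpy as np

# Made-up data, for illustration only: d = 250 dates, n = 4 risk factors.
rng = np.random.default_rng(1)
d, n = 250, 4
returns = rng.normal(scale=0.01, size=(d, n))
w = returns - returns.mean(axis=0)                 # rows (r_i - mu_hat)^T
weights = np.full(d, 1.0 / (d - 1))                # diagonal of D (equal weights assumed)

sym = rng.normal(size=(n, n))
gamma = sym + sym.T                                # an arbitrary symmetric Gamma

# QR factorization of D^{1/2} W; only the n x n triangular factor U is needed.
u = np.linalg.qr(np.sqrt(weights)[:, None] * w, mode="r")

# Symmetric eigenvalue problem for U Gamma U^T, then A = U^T Q.
lam, q = np.linalg.eigh(u @ gamma @ u.T)
a = u.T @ q

cov_from_factor = u.T @ u                          # equals W^T D W without forming it first
```

Here `cov_from_factor` is computed only to check the result against the directly formed covariance estimate; in the procedure itself the product is never needed.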


4.4 Numerical Methods for ΔΓ-Portfolios


4.4.1 


Singular portfolios have to be estimated by means of straightforward Monte Carlo simulations;
for ΔΓ-portfolios, more methods have been proposed.


Monte Carlo Methods and Variance Reduction 


Assume a multivariate normal distribution for the returns $r \in \mathbb{R}^n$:

$$p^R(r) = \frac{1}{\sqrt{(2\pi)^n |C|}} \exp\left(-\tfrac{1}{2} r^T C^{-1} r\right), \qquad (4.42)$$

where $C$ is the $n \times n$ covariance matrix and $|C|$ is its determinant. The plain Monte Carlo
(MC) method for computing VaR for ΔΓ-portfolios in this case is based on sampling the
returns $r$ from $p^R \sim N_n(0, C)$. The precise steps are described in the MC VaR numerical
project in Part II. The basic steps are summarized as follows.


• Cholesky factorize the covariance matrix, $C = U^T U$.

• For each scenario, generate an $n \times 1$ vector $y$ of independent and identically distributed
standard normal variates. For each scenario vector compute $r = U^T y$, and evaluate the
portfolio variation $\Delta V(r)$, e.g., within the ΔΓ-approximation, $\Delta V(r) = \Delta\Pi(r)$.

• Sort the portfolio variations from the complete simulation in increasing order and evaluate the
VaR as a percentile.
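The three steps above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the numerical project from Part II: the function name `mc_var`, its arguments, and the convention $P(\Delta V < -\mathrm{VaR}) = 1 - \alpha$ are choices made here. Note that NumPy returns the lower-triangular Cholesky factor $L$ with $C = LL^T$, so $U = L^T$ and $r = U^T y = Ly$.

```python
import numpy as np

def mc_var(delta, gamma, C, alpha=0.99, n_scen=100_000, seed=0):
    """Plain Monte Carlo VaR for dV = delta^T r + 0.5 r^T gamma r, r ~ N(0, C)."""
    rng = np.random.default_rng(seed)
    L = np.linalg.cholesky(C)                  # C = L L^T, i.e. U = L^T
    y = rng.standard_normal((n_scen, len(delta)))
    r = y @ L.T                                # each row is r = U^T y
    dV = r @ delta + 0.5 * np.einsum("si,ij,sj->s", r, gamma, r)
    # VaR is minus the (1 - alpha) percentile of the simulated variations.
    return -np.quantile(dV, 1.0 - alpha)

# Usage with a hypothetical two-factor, purely linear portfolio.
C = np.array([[0.04, 0.01], [0.01, 0.09]])
delta = np.array([1.0, 2.0])
gamma = np.zeros((2, 2))
var_99 = mc_var(delta, gamma, C, alpha=0.99, n_scen=200_000)
```

For a linear portfolio the result can be compared with the closed form $z_\alpha \sqrt{\Delta^T C \Delta}$.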


Other popular methods are based on importance sampling. The idea is to improve accuracy, 
i.e., reduce the variance of a simulation within the same number of scenarios. For VaR 
calculations the idea is to generate weighted scenario sets that populate the tails of the 
distribution more accurately than the body. In the general theory of the evaluation of integrals, 
importance sampling increases sampling efficiency within certain regions of the integration 
space. There are several ways of implementing importance sampling, with various degrees 
of sophistication, depending on the integral dimensionality, the integrand variability, and so 


12 Sun [Sun92, Hig96] proves an upper bound for the forward error for Cholesky factorization. Sun's result can be
adapted for our purposes to the positive-definite matrix $W^T W$ by considering a small perturbation $W \to W + \delta W$.
on. What essentially underlies the approach is the technique of changing the probability measure
from which the scenarios are sampled. (We have seen various examples of changing
measure for pricing options in previous chapters.)

A brief overview of importance sampling as applied to the evaluation of an integral is as
follows. Suppose we wish to evaluate an integral or expectation of some function $h: \mathbb{R}^n \to \mathbb{R}$,

$$I = E^f[h(X)] = \int_{\mathbb{R}^n} h(x) f(x)\,dx, \qquad (4.43)$$

where $X$ is a random vector in $\mathbb{R}^n$. The superscript $f$ denotes, as usual, the fact that the
expectation is taken w.r.t. a probability measure or distribution $f$. If we use plain MC,
then scenarios $X_j$, $j = 1, \ldots, N$, are sampled with $f$ as density; i.e., the MC estimator (w.r.t.
$f$ as density) of $I$ is

$$\hat I_f = \frac{1}{N} \sum_{j=1}^{N} h(X_j). \qquad (4.44)$$


Alternatively, the integral $I$ can be equivalently recast as an expectation w.r.t. any other
density $g$, as long as this density satisfies $f(x) > 0 \Rightarrow g(x) > 0$, $x \in \mathbb{R}^n$:

$$I = E^g\left[h(X)\frac{f(X)}{g(X)}\right] = \int_{\mathbb{R}^n} h(x)\, w(x)\, g(x)\,dx, \qquad (4.45)$$

where $w(x) = f(x)/g(x)$ is a weight function (also called the Radon-Nikodym derivative or likelihood
ratio). This factor is introduced by taking $g$ as density in place of the original density $f$.

Applying standard MC, with samples $X_j$, $j = 1, \ldots, N$, now drawn from $g$, gives an estimator
w.r.t. $g$:

$$\hat I_g = \frac{1}{N} \sum_{j=1}^{N} h(X_j) \frac{f(X_j)}{g(X_j)}. \qquad (4.46)$$





By taking expectations in this expression and treating the $X_j$ as identically distributed random
vectors with $g$ as density, the reader can readily show from equation (4.45) that $\hat I_g$ is an
unbiased estimator of $I$; i.e., $E^g[\hat I_g] = I$. Likewise, $E^f[\hat I_f] = I$. In practice, it is of interest
to compare the variances $\mathrm{Var}_g(\hat I_g)$ and $\mathrm{Var}_f(\hat I_f)$ of the two estimators $\hat I_g$
and $\hat I_f$, respectively. Since both estimators have the same mean (equal to $I$), it suffices to
consider the second moments. In particular, the two variances are determined by

$$E^g[h^2(X)\, w^2(X)] \quad\text{and}\quad E^f[h^2(X)], \qquad (4.47)$$

respectively: for independent samples, $N\,\mathrm{Var}_g(\hat I_g) = E^g[h^2 w^2] - I^2$ and $N\,\mathrm{Var}_f(\hat I_f) = E^f[h^2] - I^2$.


For arbitrary choices of g, the variance Var, (with importance sampling) may be either larger 
or smaller than Var, (without importance sampling). The goal of a successful implementation 
of importance sampling is to choose a density g that is effective in sampling and thereby 
reducing the variance. In order to implement an effective importance-sampling algorithm 
one should ideally sample in proportion to the integrand $h \cdot f$. Recall the typical
situation in option pricing, where the price is given by an expectation integral over a product 
of the risk-neutral transition density and the discounted payoff function. Hence, we can view 
the transition density as playing the role of f and h as the discounted payoff function. 
A plain MC calculation for the option price would proceed by sampling asset price paths 
with transition density as the sampling distribution. This is the basis of the MC basket option 
pricer numerical project in Part II. In contrast, a more effective importance-sampling MC
algorithm for option pricing would use a different sampling distribution, one that
gives more “importance” to the payoff function as well as the risk-neutral density. A good
choice of density $g$ should be such that a greater percentage of sample paths lies within the
regions where the integrand contributes most.
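To make the comparison between the plain and the importance-sampling estimators concrete, here is a small Python sketch for a one-dimensional tail probability $I = P(X < c)$ with $X$ standard normal, so that $h$ is an indicator function. The sampling density $g = N(0, \sigma^2)$ with $\sigma = 3$, the function names, and the sample size are illustrative assumptions, not choices made in the text.

```python
import numpy as np
from math import erf, sqrt

def tail_prob_plain(c, n, rng):
    """Plain MC estimator of P(X < c): sample from f = N(0, 1)."""
    x = rng.standard_normal(n)
    return np.mean(x < c)

def tail_prob_is(c, n, rng, sigma=3.0):
    """Importance-sampling estimator: sample from g = N(0, sigma^2)."""
    x = sigma * rng.standard_normal(n)
    # Weight w = f/g for the two normal densities.
    w = sigma * np.exp(-0.5 * x**2 * (1.0 - 1.0 / sigma**2))
    return np.mean((x < c) * w)

rng = np.random.default_rng(1)
c = -3.0
exact = 0.5 * (1.0 + erf(c / sqrt(2.0)))       # Phi(-3)
est_plain = tail_prob_plain(c, 200_000, rng)
est_is = tail_prob_is(c, 200_000, rng)
```

Because the wider density places many more samples below $c$, the weighted estimator typically lands much closer to `exact` than the plain one for the same number of scenarios.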

For the purposes of computing VaR, we observe from upcoming equation (4.72) that a
more efficient importance-sampling procedure should be one in which the chosen sampling
density generates a substantial number of return scenarios that are in the tails of the distribution
for the portfolio variation $\Delta V$. This can be seen from the fact that the cdf of $\Delta V$ is an integral
over the product of the return distribution and the step function. For VaR calculations, the step
function is significant only in the left tail of the distribution of $\Delta V$. One possible importance-sampling
MC strategy therefore consists of generating scaled returns by introducing a scale
factor $f_s$, so as to transform the return scenarios $r$ into $(\sqrt{1 - f_s})^{-1} r$. In some sense one
can also think of $f_s$ as a stress-testing factor. The factor $f_s$ is strictly between 0 and 1. Another
possibility is to shift the returns by a common vector, i.e., to transform $r$ into $r + r_0$. A more
general approach is to make an affine transformation, i.e., to transform $r$ into $Ar + r_0$ for
some matrix $A$.

To show how to compute the weights in an importance-sampling implementation, let's
consider the simple case of scaling for a univariate, standard normal distribution $\phi(x)$. The
particular technique we now present is readily generalizable to the multivariate case. In
the one-dimensional case, we are assuming that our original sampling density is the standard
normal, i.e., $f(x) = \phi(x)$, and that we wish to evaluate an integral of the form

$$I = \int_{-\infty}^{\infty} h(x)\,\phi(x)\,dx, \qquad (4.48)$$

where $h$ has significant contributions in the tails of $\phi$. Sampling from the pdf $\phi$ itself would
probably not constitute an optimal importance-sampling strategy. The trick we employ is to
rewrite the original density as follows:

$$\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} = \sqrt{q}\, e^{-x^2/(2p)} \cdot \frac{1}{\sqrt{2\pi q}}\, e^{-x^2/(2q)}, \qquad (4.49)$$

where $p$ and $q$ are positive numbers chosen such that

$$\frac{1}{p} + \frac{1}{q} = 1. \qquad (4.50)$$

By defining the factor $f_s$ so that $p = 1/f_s$ and $q = 1/(1 - f_s)$, we have

$$\phi(x) = w(x)\,\tilde\phi(x), \qquad (4.51)$$

where $w$ is the weight function

$$w(x) = (1 - f_s)^{-1/2} \exp(-f_s x^2/2) \qquad (4.52)$$

and $\tilde\phi$ is the new sampling density

$$\tilde\phi(x) = \sqrt{\frac{1 - f_s}{2\pi}}\; e^{-(1 - f_s) x^2/2}. \qquad (4.53)$$
The original integral is therefore transformed into

$$I = \int_{-\infty}^{\infty} h(x)\, w(x)\, \tilde\phi(x)\,dx, \qquad (4.54)$$

which can now be evaluated by sampling from the pdf $\tilde\phi$, a normal density with rescaled
variance $(1 - f_s)^{-1}$. The parameter $f_s$ can hence be chosen so as to reduce variance in the
corresponding MC estimate of $I$.


4.4.2 Moment Methods


Moment methods are approximate analytical methods that propose to estimate portfolio VaR 
by evaluating the first few moments of the distribution of the portfolio variation analytically 
and then matching these moments with a model distribution. The first four moments are of 
particular interest, for they provide us with measures of the mean, variance, skewness, and 
kurtosis. We discuss two methods in this class, one named Cornish—Fisher and the other the 
Johnson method. 

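In what follows we shall consider portfolio variations within the delta-gamma approximation
in the form

$$\Delta V = \Delta V(r) = \Delta^T r + \tfrac{1}{2} r^T \Gamma r. \qquad (4.55)$$

[Note: This is $\Delta\hat\Pi$, as defined previously but without the theta factor $\Theta t$, which is trivial to
include if desired.] We denote $\mu_1 = E[\Delta V]$ as the first moment of the distribution of $\Delta V$ and
$\mu_m = E[(\Delta V - \mu_1)^m]$ as the $m$th central moment for $m \ge 2$. Assuming the return density is given
by equation (4.42), a straightforward, though lengthy, calculation gives

$$\mu_1 = E[\Delta V] = \tfrac{1}{2}\,\mathrm{tr}(\Gamma C), \qquad (4.56)$$
$$\mu_2 = E[(\Delta V - \mu_1)^2] = \Delta^T C \Delta + \tfrac{1}{2}\,\mathrm{tr}\{(\Gamma C)^2\}, \qquad (4.57)$$
$$\mu_3 = E[(\Delta V - \mu_1)^3] = 3 (C\Delta)^T \Gamma (C\Delta) + \mathrm{tr}\{(\Gamma C)^3\}, \qquad (4.58)$$
$$\mu_4 = E[(\Delta V - \mu_1)^4] = 12 (C\Delta)^T (\Gamma C \Gamma)(C\Delta) + 3\,\mathrm{tr}\{(\Gamma C)^4\} + 3\mu_2^2, \qquad (4.59)$$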


where tr denotes the matrix trace. For example, the first moment is simple to derive:

$$\mu_1 = E[\Delta V] = \Delta^T E[r] + \tfrac{1}{2} E[r^T \Gamma r] = \tfrac{1}{2} \sum_{i,j=1}^{n} \Gamma_{ij} E[r_i r_j] = \tfrac{1}{2} \sum_{i,j} \Gamma_{ij} C_{ji} = \tfrac{1}{2}\,\mathrm{tr}(\Gamma C),$$

where we used $E[r] = 0$ and $\mathrm{Cov}(r_i, r_j) = C_{ij}$, since the density $p^R \sim N_n(0, C)$. The higher
moments can be computed by a similar procedure, using known identities for integrals of
products such as $p^R r_k r_l$ and higher products, with $k, l = 1, 2, 3, 4$. For moments $\mu_3$, $\mu_4$ this
is rather tedious. Alternatively, we can obtain the (noncentral) $m$th moments (and thereby the
central moments) by evaluating the $m$th derivative (at the origin) of the moment-generating
function (mgf) for the random variable $\Delta V$. That is, $E[(\Delta V)^m] = M^{(m)}(0)$, where the mgf
$M(u) = E[e^{u \Delta V}]$ is given analytically as derived in the next section.
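The moment formulas (4.56)-(4.59) translate directly into matrix expressions. The following Python sketch is an illustration only; the function name `delta_gamma_moments` and the argument layout are assumptions made here.

```python
import numpy as np

def delta_gamma_moments(delta, gamma, C):
    """Mean and central moments mu_2..mu_4 of dV = delta^T r + 0.5 r^T gamma r."""
    GC = gamma @ C
    Cd = C @ delta
    mu1 = 0.5 * np.trace(GC)
    mu2 = delta @ Cd + 0.5 * np.trace(GC @ GC)
    mu3 = 3.0 * (Cd @ gamma @ Cd) + np.trace(GC @ GC @ GC)
    mu4 = (12.0 * (Cd @ (gamma @ C @ gamma) @ Cd)
           + 3.0 * np.trace(np.linalg.matrix_power(GC, 4))
           + 3.0 * mu2**2)
    return mu1, mu2, mu3, mu4

# Check: delta = 0, gamma = 2, C = 1 gives dV = r^2, a chi-square variable
# with one degree of freedom (mean 1, variance 2, mu_3 = 8, mu_4 = 60).
m_chi2 = delta_gamma_moments(np.array([0.0]), np.array([[2.0]]), np.array([[1.0]]))
```

The chi-square case is a convenient sanity check because its moments are known in closed form.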
The Cornish-Fisher (CF) method stems from the fact that it is possible to derive explicit
polynomial asymptotic expansions for standardized quantiles (or percentiles) of a general
distribution in terms of its (standardized) moments and the quantiles (or percentiles) of the
standard normal distribution. For a detailed mathematical discussion of the general technique,
see, for example, [HD68]. For our purposes it suffices to point out the main result of the CF
expansion. Generally, given a probability distribution $g(x)$ having cumulants$^{13}$ $\kappa_j$, $j = 0, \ldots$,
the distribution $f(x)$ generated by the expansion

$$f(x) = \exp\left(\sum_{j=0}^{\infty} \epsilon_j \frac{(-D)^j}{j!}\right) g(x) \qquad (4.60)$$

has cumulants $\kappa_j + \epsilon_j$, $j = 0, \ldots$, where $D^j = d^j/dx^j$ defines the $j$th-order differential operator.
By truncating this expansion, one can hence obtain approximate analytical formulas for
the density of a distribution $f$ using only its first few known central moments and the first
few derivatives of an analytically known distribution function $g$ and its central moments.
Similarly, we also obtain analytical formulas that relate the quantiles of $f$ with those of $g$.
When $g$ is chosen to be the standard normal distribution, then what arises is the known
Cornish-Fisher formula.

Given the first four central moments in equations (4.56)-(4.59), computing VaR with the
CF expansion (to fourth order) is particularly simple, for there are explicit formulas for it. In
particular, the random variable $(\Delta V - \mu_1)/\sqrt{\mu_2}$ has $\alpha$-quantile given by

$$\tilde z_\alpha = z_\alpha + \frac{1}{6}\rho_3 (z_\alpha^2 - 1) + \frac{1}{24}\rho_4 z_\alpha (z_\alpha^2 - 3) - \frac{1}{36}\rho_3^2 z_\alpha (2 z_\alpha^2 - 5),$$

where $\rho_3 = \mu_3/\mu_2^{3/2}$, $\rho_4 = \mu_4/\mu_2^2 - 3$, and $z_\alpha$ is defined as the $\alpha$-quantile of the standard
normal distribution, $z_\alpha = N^{-1}(\alpha)$.

Within the CF approximation, VaR with confidence level $\alpha$ (as defined by equation (4.1))
is given by

$$\mathrm{VaR} = \tilde z_\alpha \sqrt{\mu_2} - \mu_1. \qquad (4.61)$$

Note that within the simpler centered normal distribution approximation for $\Delta V$ (with assumed
zero gamma matrix), $\rho_3 = \rho_4 = 0$, so $\tilde z_\alpha = z_\alpha$ and this equation is consistent with equation (4.6).
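A short Python sketch of the fourth-order Cornish-Fisher adjustment follows. It is an illustration under stated assumptions: the function names are hypothetical, the standard normal quantile comes from the Python standard library, and `cf_var` simply follows the sign convention of equation (4.61) as printed.

```python
from math import sqrt
from statistics import NormalDist

def cornish_fisher_quantile(alpha, mu2, mu3, mu4):
    """CF-adjusted alpha-quantile of the standardized variable."""
    z = NormalDist().inv_cdf(alpha)
    rho3 = mu3 / mu2**1.5                  # standardized skewness
    rho4 = mu4 / mu2**2 - 3.0              # excess kurtosis
    return (z
            + rho3 * (z * z - 1.0) / 6.0
            + rho4 * z * (z * z - 3.0) / 24.0
            - rho3**2 * z * (2.0 * z * z - 5.0) / 36.0)

def cf_var(alpha, mu1, mu2, mu3, mu4):
    """VaR assembled as in equation (4.61)."""
    return cornish_fisher_quantile(alpha, mu2, mu3, mu4) * sqrt(mu2) - mu1

# With normal moments (mu3 = 0, mu4 = 3 mu2^2) the correction vanishes.
var_normal = cf_var(0.99, 0.0, 4.0, 0.0, 48.0)
```

In the normal case the result reduces to $z_\alpha \sqrt{\mu_2}$, which gives a quick consistency check.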

In the Johnson method, one seeks to match the first four moments of the pdf of $\Delta V$ to
the cumulative distribution of the random variable

$$f(X) = \lambda \sinh((X - \gamma)/\delta) + \xi, \qquad (4.62)$$

where $X$ is a standard normal and $\xi, \delta, \gamma, \lambda$ are model parameters. The pdf of the random
variable $Y = f(X)$ can be found by means of changing coordinates and is given by

$$p(y) = \frac{\delta\, e^{-[\gamma + \delta \sinh^{-1}((y - \xi)/\lambda)]^2/2}}{\sqrt{2\pi}\,\lambda \cosh[\sinh^{-1}((y - \xi)/\lambda)]}. \qquad (4.63)$$


13 The cumulants of a distribution are defined by coefficients in a power series expansion of the logarithm of
the characteristic function or the mgf. Cumulants are related to the central moments: $\kappa_1 = \mu_1$, $\kappa_2 = \mu_2$, $\kappa_3 = \mu_3$,
$\kappa_4 = \mu_4 - 3\mu_2^2$, etc.
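The expectation integrals $E[y^n]$ for the Johnson distribution, for $n = 0, 1, \ldots$, can be
expressed as follows:

$$E[y^n] = \frac{\delta}{\sqrt{2\pi}} \int_{-\infty}^{\infty} (\xi + \lambda \sinh x)^n\, e^{-(\gamma + \delta x)^2/2}\,dx. \qquad (4.64)$$

The integrals

$$I_n(\delta, \gamma) = \frac{\delta}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \sinh^n x\; e^{-(\gamma + \delta x)^2/2}\,dx \qquad (4.65)$$

are obtained recursively using the recursion relation

$$I_{n+1}(\delta, \gamma) = \tfrac{1}{2} e^{1/(2\delta^2)} \left[ e^{-\gamma/\delta} I_n(\delta, \gamma - 1/\delta) - e^{\gamma/\delta} I_n(\delta, \gamma + 1/\delta) \right] \qquad (4.66)$$

and the formula for $n = 1$:

$$I_1(\delta, \gamma) = -e^{1/(2\delta^2)} \sinh(\gamma/\delta). \qquad (4.67)$$

From equations (4.64) and (4.65) we find that

$$E[y] = \xi + \lambda I_1,$$
$$E[y^2] = \xi^2 + 2\xi\lambda I_1 + \lambda^2 I_2,$$
$$E[y^3] = \xi E[y^2] + \lambda\xi^2 I_1 + 2\xi\lambda^2 I_2 + \lambda^3 I_3,$$
$$E[y^4] = \xi E[y^3] + \lambda\xi^3 I_1 + 3\xi^2\lambda^2 I_2 + 3\xi\lambda^3 I_3 + \lambda^4 I_4,$$

where $I_n = I_n(\delta, \gamma)$ for all $n$. The moments $\mu_i^J = \mu_i^J(\xi, \delta, \gamma, \lambda)$ for the Johnson distribution
are given by

$$\mu_1^J = E[y],$$
$$\mu_2^J = E[(y - \mu_1^J)^2] = E[y^2] - (\mu_1^J)^2,$$
$$\mu_3^J = E[(y - \mu_1^J)^3] = E[y^3] - 2\mu_1^J E[y^2] + (\mu_1^J)^3 - \mu_1^J \mu_2^J,$$
$$\mu_4^J = E[(y - \mu_1^J)^4] = E[y^4] - 3\mu_1^J E[y^3] + 3(\mu_1^J)^2 E[y^2] - (\mu_1^J)^4 - \mu_1^J \mu_3^J.$$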


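These four moments are explicitly functions of the four parameters $\xi, \delta, \gamma, \lambda$, which are then
fitted by matching the $\mu_i^J$ with the $\mu_i$ in equations (4.56)-(4.59). This results in a nonlinear
system of four equations,

$$\mu_1 = \mu_1^J(\xi, \delta, \gamma, \lambda), \qquad (4.68)$$
$$\mu_2 = \mu_2^J(\xi, \delta, \gamma, \lambda), \qquad (4.69)$$
$$\mu_3 = \mu_3^J(\xi, \delta, \gamma, \lambda), \qquad (4.70)$$
$$\mu_4 = \mu_4^J(\xi, \delta, \gamma, \lambda), \qquad (4.71)$$

which is solved for $\xi, \delta, \gamma, \lambda$.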
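The integrals $I_n(\delta, \gamma)$ that enter the Johnson moments can be evaluated by a direct recursion starting from $I_0 = 1$. The Python sketch below assumes the recursion $I_{n+1}(\delta,\gamma) = \tfrac12 e^{1/(2\delta^2)}[e^{-\gamma/\delta} I_n(\delta,\gamma - 1/\delta) - e^{\gamma/\delta} I_n(\delta,\gamma + 1/\delta)]$, which follows from writing $\sinh x = (e^x - e^{-x})/2$ and completing the square; the function names are hypothetical, and the nonlinear solve for the four parameters is not shown.

```python
from math import comb, cosh, exp, sinh

def johnson_I(n, delta, gamma):
    """I_n(delta, gamma) by recursion on n, with I_0 = 1."""
    if n == 0:
        return 1.0
    pref = 0.5 * exp(0.5 / delta**2)
    return pref * (exp(-gamma / delta) * johnson_I(n - 1, delta, gamma - 1.0 / delta)
                   - exp(gamma / delta) * johnson_I(n - 1, delta, gamma + 1.0 / delta))

def johnson_raw_moment(n, xi, delta, gamma, lam):
    """E[y^n] from the binomial expansion of (xi + lam * sinh x)^n."""
    return sum(comb(n, k) * xi**(n - k) * lam**k * johnson_I(k, delta, gamma)
               for k in range(n + 1))

# Closed forms used as checks for n = 1 and n = 2.
i1 = johnson_I(1, 2.0, 0.5)      # -exp(1/8) * sinh(1/4)
i2 = johnson_I(2, 2.0, 0.5)      # (exp(1/2) * cosh(1/2) - 1) / 2
```

The recursion makes $2^n$ calls, which is harmless here since only $n \le 4$ is needed for the four moment equations.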


4.4.3 Fourier Transform of the Moment-Generating Function


In the Fourier transform method proposed by Milne-Ulma, the idea is to compute the moment-generating
function for the portfolio return distribution and then to invert a Fourier transform
to obtain the density $f(V)$ of the portfolio P&L. From this density one then computes VaR
from the area under the left tail of the density $f$ corresponding to a given percentile.
Consider the cumulative distribution function for the portfolio variation $\Delta V$,

$$\Phi(V) = P(\Delta V < V) = \int_{\mathbb{R}^n} p^R(r)\,\theta(V - \Delta V(r))\,dr, \qquad (4.72)$$

where $p^R$ is given by equation (4.42), the integration is over the complete space of all
risk-factor returns, and $\theta(\cdot)$ is the unit step function.
Differentiating this gives $f(V)$:

$$f(V) = \frac{d\Phi(V)}{dV}. \qquad (4.73)$$

Taking the derivative w.r.t. $V$ inside the integral gives

$$f(V) = \frac{1}{\sqrt{\det(2\pi C)}} \int_{\mathbb{R}^n} \exp(-\tfrac{1}{2} r^T C^{-1} r)\,\delta(V - \Delta V(r))\,dr, \qquad (4.74)$$

where we have used the property $d\theta(x)/dx = \delta(x)$. The Dirac delta function $\delta(x)$ is then
written in terms of its integral representation, and, assuming a delta-gamma approximation,
$\Delta V(r) \approx \Delta^T r + \tfrac{1}{2} r^T \Gamma r$, we find that


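$$f(V) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-iuV} M(iu)\,du, \qquad (4.75)$$

$i = \sqrt{-1}$, where $M$ is the moment-generating function (mgf)

$$M(u) = \frac{1}{\sqrt{\det(2\pi C)}} \int_{\mathbb{R}^n} \exp(-\tfrac{1}{2} x^T [C^{-1} - u\Gamma] x + u \Delta^T x)\,dx. \qquad (4.76)$$

The mgf is given by a Gaussian integral and can be explicitly computed by using the
integral identity

$$\int_{\mathbb{R}^n} \exp(-x^T [A + iB] x + v^T x)\,dx = \frac{\pi^{n/2}}{\sqrt{\det(A + iB)}} \exp(\tfrac{1}{4} v^T (A + iB)^{-1} v) \qquad (4.77)$$

for any $n \times 1$ vector $v$ and $n \times n$ (complex) matrices $A$, $B$. Setting $A = \tfrac{1}{2} C^{-1}$, $B = \tfrac{i}{2} u \Gamma$, and
$v = u\Delta$ we find

$$M(u) = \frac{1}{\sqrt{\det(I - u\Gamma C)}} \exp(\tfrac{1}{2} u^2 (C\Delta)^T (I - u\Gamma C)^{-1} \Delta), \qquad (4.78)$$

where $I$ is the $n \times n$ identity matrix. Note that an equivalent expression also follows in terms
of the transpose matrix $(I - u\Gamma C)^T = I - uC\Gamma$.
The last step is to cast the given mgf into the following computationally tractable form:

$$M(u) = \prod_{j=1}^{n} (1 - u\lambda_j)^{-1/2} \exp\left[\frac{u^2}{2} \frac{b_j^2}{1 - u\lambda_j}\right]. \qquad (4.79)$$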
Here the $b_j$ are the components of the vector $b$ given by

$$b = O^T U \Delta, \qquad (4.80)$$

where $O$ is the matrix of eigenvectors of the symmetric matrix $U \Gamma U^T$,

$$O^T (U \Gamma U^T) O = \Lambda. \qquad (4.81)$$

The latter equation gives the diagonal matrix $\Lambda$, whose diagonal elements define the given
$\lambda_j$ components. The matrix $U$ is defined (as usual) in the Cholesky factorization of the
covariance matrix $C = U^T U$.

We finally obtain the real part of the mgf,

$$\mathrm{Re}\{M(iu)\} = \left(\prod_{j=1}^{n} (1 + u^2\lambda_j^2)^{-1/4} \exp\left[-\frac{u^2 b_j^2/2}{1 + u^2\lambda_j^2}\right]\right) \cos(\varphi), \qquad (4.82)$$

and an identical expression for the imaginary part $\mathrm{Im}\{M(iu)\}$, with the cosine function
replaced by the sine function. The phase function is given by

$$\varphi = \frac{1}{2} \sum_{j=1}^{n} \tan^{-1}(u\lambda_j) - \frac{u^3}{2} \sum_{j=1}^{n} \frac{b_j^2 \lambda_j}{1 + u^2\lambda_j^2}. \qquad (4.83)$$

The final form of the Fourier transform involves only real quantities:

$$f(V) = \frac{1}{\pi} \int_{0}^{\infty} [\cos(uV)\,\mathrm{Re}\{M(iu)\} + \sin(uV)\,\mathrm{Im}\{M(iu)\}]\,du. \qquad (4.84)$$

This is a sum of cosine and sine transforms, which can be evaluated using a number of
appropriate numerical routines for integrating one-dimensional oscillatory functions. It is
particularly important to implement an algorithm that gives an accurate representation of the
pdf within the left tail. From equation (4.73), then, VaR for a chosen percentile $1 - \alpha$ is
obtained by evaluating the area under the left tail of $f$; i.e., the cumulative density gives
$\Phi(V = -\mathrm{VaR}) = 1 - \alpha$.
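As a sanity check on the two forms of the mgf, the Python sketch below evaluates the determinant form (4.78) and the eigenvalue form (4.79) for real $u$ and compares them. The portfolio data and function names are hypothetical; the example is restricted to small real $u$ with $1 - u\lambda_j > 0$ so that no complex branch choices arise, and it does not perform the Fourier inversion itself.

```python
import numpy as np

def mgf_direct(u, delta, gamma, C):
    """M(u) from the determinant form, for real u."""
    A = np.eye(len(delta)) - u * gamma @ C
    quad = (C @ delta) @ np.linalg.solve(A, delta)
    return np.exp(0.5 * u**2 * quad) / np.sqrt(np.linalg.det(A))

def mgf_eig(u, delta, gamma, C):
    """M(u) from the eigenvalue form, using C = U^T U."""
    U = np.linalg.cholesky(C).T                 # NumPy returns L with C = L L^T
    lam, O = np.linalg.eigh(U @ gamma @ U.T)    # O^T (U gamma U^T) O = diag(lam)
    b = O.T @ U @ delta
    return np.prod((1.0 - u * lam) ** -0.5) * np.exp(0.5 * u**2 * np.sum(b**2 / (1.0 - u * lam)))

C = np.array([[0.04, 0.01], [0.01, 0.09]])
gamma = np.array([[1.0, 0.5], [0.5, -2.0]])
delta = np.array([1.0, 2.0])
m_a = mgf_direct(0.2, delta, gamma, C)
m_b = mgf_eig(0.2, delta, gamma, C)
```

A central difference of `mgf_eig` at the origin should also reproduce the first moment $\tfrac12 \mathrm{tr}(\Gamma C)$.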


4.5 The Fast Convolution Method 


The risk-factor model and the pricing model provide the necessary ingredients to make 
equation (4.1), defining value-at-risk, meaningful. In this section, we present a new Fourier 
transform algorithm for computing value-at-risk. The method is different from existing Fourier 
methods (see [MU99, DP01, GHS02]) in that it does not assume that the characteristic function 
(or mgf) of the density is explicitly known. The method therefore has the advantage of greater 
freedom in choosing the risk-factor model. We present an extended example illustrating the 
performance of the algorithm and the importance of risk-factor models with fat tails. In later 
sections, we also extend the method to compute the gradient of value-at-risk. The section 
concludes with two computational examples. The first is a simple linear approximation to 
the change in value-at-risk with changes in portfolio composition. In the second example we 
hedge a derivatives portfolio by solving an optimization problem to minimize the value-at-risk. 
The local dynamics of changes in portfolio value is again approximated by

$$\Delta\hat\Pi = \Theta t + \Delta^T r + \tfrac{1}{2} r^T \Gamma r,$$

and we approximate the value-at-risk by the solution to

$$P(\Delta\hat\Pi < -\mathrm{VaR}) = 1 - \alpha. \qquad (4.85)$$

Because $\Delta\hat\Pi$ is a quadratic function, it is generally easier to solve equation (4.85) than equation (4.1).
At the same time, since $\Delta\hat\Pi$ is locally accurate, the solution VaR to equation (4.85)
is a good approximation to value-at-risk, provided the probability of large changes in the
risk factors is relatively small. This is exemplified by the relative closeness of the $\Delta$ and $\Delta\Gamma$
distributions (see Figure 4.12). We return to study accuracy in Section 4.8.

Equation (4.85) is solved in two steps. We find the pdf of $\Delta\hat\Pi$ and compute the value-at-risk
from this distribution. Consider a risk-factor model and a factorization of the portfolio
of the type produced by Algorithm 1. That is, the joint distribution of the transformed risk
factors has a product pdf

$$p(x) = p_1(x_1) \cdots p_n(x_n),$$

and the $\Delta\Gamma$-approximation is a sum of independent quadratic functions

$$\Delta\hat\Pi = \xi + \sum_{i=1}^{n} \pi_i(x_i) \quad\text{where}\quad \pi_i = \Delta_i x_i + \frac{\lambda_i}{2} x_i^2.$$
FIGURE 4.12 $\Delta\Gamma$ distribution (solid line) compared to the $\Delta$ distribution (dashed line) for relatively
small changes in risk factors.


Let $p_{\pi_i}(x)$ be the pdf of $\pi_i$ for $i = 1, \ldots, n$. Then the pdf of $\Delta\hat\Pi$ has the form of a multiple
convolution,$^{14}$

$$p_{\Delta\hat\Pi} = T_{-\xi}(p_{\pi_1} * \cdots * p_{\pi_n}). \qquad (4.86)$$


4.5.1 The Probability Density Function of a Quadratic Random Variable


Given a single risk factor $x_i$ with pdf $p_i$ and a quadratic portfolio component

$$\pi_i = \Delta_i x_i + \frac{\lambda_i}{2} x_i^2,$$

it is easy to derive the pdf $p_{\pi_i}$ for $\pi_i$. Let $x_+(u)$ and $x_-(u)$ be the two roots of $\pi_i(x) - u = 0$,

$$x_\pm(u) = \frac{-\Delta_i \pm \sqrt{\Delta_i^2 + 2\lambda_i u}}{\lambda_i}, \qquad (-\Delta_i^2/2\lambda_i \le u). \qquad (4.87)$$

The pdf $p_{\pi_i}$ is the derivative of the probability

$$P(\pi_i \le u) = P(x \in x_+([-\infty, u])) + P(x \in x_-([-\infty, u])).$$

The sets $x_\pm([-\infty, u])$ are empty for $u < -\Delta_i^2/2\lambda_i$, and otherwise

$$p_{\pi_i}(u) = p_i(x_+(u))\, x_+'(u) - p_i(x_-(u))\, x_-'(u).$$

It follows that the pdf is

$$p_{\pi_i}(u) = \begin{cases} 0, & \text{if } u < -\Delta_i^2/2\lambda_i, \\ \dfrac{p_i(x_+(u)) + p_i(x_-(u))}{\sqrt{\Delta_i^2 + 2\lambda_i u}}, & \text{if } u > -\Delta_i^2/2\lambda_i. \end{cases}$$

The pdf has a singularity at $u = -\Delta_i^2/2\lambda_i$, i.e., at the critical point $x = -\Delta_i/\lambda_i$ of $\pi_i$.


4.5.2 Discretization


Let $[-a, a]$, $a > 0$, be a closed interval. Consider a regular grid $\{\xi_j = -a + jh\}_{j=0}^{N-1}$, where
$h = 2a/N$. Because $p_{\pi_i}$ has a singularity, it is necessary to use a discretization scheme that
conserves probability (see Figure 4.13). Therefore, we take

$$p_{\pi_i}(x) \approx p^h_{\pi_i}(x) = h \sum_{j=0}^{N-1} p^i_j\, \delta_j(x) \quad\text{where}\quad p^i_j = \frac{1}{h} \int_{\xi_j - h/2}^{\xi_j + h/2} p_{\pi_i}(u)\,du, \qquad (4.88)$$

where $\delta$ is the delta function and $\delta_j(x) = \delta(x - \xi_j)$. For convenience, let $x_0 = -a$ and define
$p^i_0 = 0$.
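One way to obtain probability-conserving cell averages without integrating through the singularity is to difference the cdf of $\pi_i$ across each cell. The Python sketch below does this for a standard normal risk factor. It is an illustration under stated assumptions: the function names are hypothetical, only $\lambda_i \ne 0$ is handled, and the special treatment of the first grid point is omitted.

```python
import numpy as np
from statistics import NormalDist

_Phi = np.vectorize(NormalDist().cdf)

def quad_cdf(u, delta, lam):
    """P(delta*x + 0.5*lam*x**2 <= u) for a standard normal x, lam != 0."""
    u = np.asarray(u, dtype=float)
    r = np.sqrt(np.maximum(delta**2 + 2.0 * lam * u, 0.0))
    centre = -delta / lam                      # critical point of the quadratic
    inside = _Phi(centre + r / abs(lam)) - _Phi(centre - r / abs(lam))
    # For lam > 0 the event lies between the roots, for lam < 0 outside them.
    return inside if lam > 0 else 1.0 - inside

def discretize(delta, lam, a, N):
    """Grid xi_j = -a + j*h and cell averages p_j = (1/h) * P(cell j)."""
    h = 2.0 * a / N
    xi = -a + h * np.arange(N)
    p = (quad_cdf(xi + h / 2, delta, lam) - quad_cdf(xi - h / 2, delta, lam)) / h
    return xi, p, h

# Usage: pi = x^2 is a chi-square variable with one degree of freedom.
xi, p, h = discretize(0.0, 2.0, 16.0, 1024)
total_mass = h * p.sum()
```

Because the cell probabilities telescope, `total_mass` equals the probability of the covered interval exactly, regardless of how coarse the grid is.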


14 The function $f * g$ defines the convolution product

$$f * g(x) = \int_{-\infty}^{\infty} f(y)\, g(x - y)\,dy.$$

The functional $T_a$ is the shift operator defined by

$$T_a f(x) = f(x + a).$$


FIGURE 4.13 Example of the importance of a discretization method that conserves probability density.
Consider $X^2$, where $X$ is normal, i.e., a $\chi^2$ random variable with one degree of freedom. For the
conservative method, the discrete pdf and cdf are close to the pdf and cdf of the $\chi^2$ random variable.
In the nonconservative method, the pdf is sampled at the grid points to get a discrete approximation.
As the graph for the cdf shows, the distribution function for the discrete approximation is not close to
the original cdf.


4.5.3 Accuracy and Convergence


The accuracy of the discretized function depends on the density outside the interval $[-a, a]$
and the number of grid points $N$. Suppose that

$$P(\pi_i < -a + h/2) < \epsilon/2 \quad\text{and}\quad P(\pi_i > a - h/2) < \epsilon/2,$$

for $i = 1, \ldots, n$, and some small $\epsilon > 0$. On $[-a, a]$, the discretization converges to the
probability density in the weak sense.

As a special case of convergence, the approximate cdf converges linearly to the exact
cumulative density: If $\xi_k - h/2 < y < \xi_k + h/2$, then

$$\left| \int_{\xi_k - h/2}^{y} \Big[ p_{\pi_i}(x) - h \sum_{j=0}^{N-1} p^i_j\, \delta_j(x) \Big] dx \right| \le p^i_k h,$$

and they agree exactly for $y = \xi_k + h/2$. For the cdf, we have

$$\left| \int_{-\infty}^{y} [p_{\pi_i}(x) - p^h_{\pi_i}(x)]\,dx \right| \le \epsilon + h \max_k p^i_k.$$

4.5.4 The Computational Details 


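To compute the coefficients in equation (4.88) is a bit messy. We consider the case $\lambda_i \ne 0$
(the case $\lambda_i = 0$ is simple). Considering an interval $[\xi_j - h/2, \xi_j + h/2]$, the computation of
$p^i_j$ falls into one of three categories:


(i) The polynomial $\pi_i(x) - u$ does not have any zeros for any $u \in [\xi_j - h/2, \xi_j + h/2]$.
This gives

$$p^i_j = 0. \qquad (4.89)$$

(ii) The polynomial $\pi_i(x) - u$ has a double zero for some $u \in [\xi_j - h/2, \xi_j + h/2]$. It
follows that

$$p^i_j = \begin{cases} \dfrac{1}{h} \displaystyle\int_{x_-(\xi_j + h/2)}^{x_+(\xi_j + h/2)} p_i(x)\,dx, & \text{if } \lambda_i > 0, \\[2ex] \dfrac{1}{h} \displaystyle\int_{x_+(\xi_j - h/2)}^{x_-(\xi_j - h/2)} p_i(x)\,dx, & \text{if } \lambda_i < 0. \end{cases} \qquad (4.90)$$

(iii) The polynomial $\pi_i(x) - u$ has two distinct zeros for each $u \in [\xi_j - h/2, \xi_j + h/2]$. This
yields

$$p^i_j = \frac{1}{h} \int_{x_+(\xi_j - h/2)}^{x_+(\xi_j + h/2)} p_i(x)\,dx + \frac{1}{h} \int_{x_-(\xi_j + h/2)}^{x_-(\xi_j - h/2)} p_i(x)\,dx. \qquad (4.91)$$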


4.5.5 Convolution with the Fast Fourier Transform

Because the pdf for the $\Delta\Gamma$-approximation $\Delta\hat\Pi$ is a convolution product, it can be computed
using ideas from Fourier analysis [GW98]. The convolution product and its Fourier
transform$^{15}$ satisfy

$$\widehat{f * g}(\omega) = \hat f(\omega)\, \hat g(\omega).$$


15 The continuous Fourier transform of a function $f(x)$ is

$$\hat f(\omega) = \int_{-\infty}^{\infty} e^{-i\omega x} f(x)\,dx.$$

The inverse Fourier transform is

$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i\omega x} \hat f(\omega)\,d\omega.$$


It is possible to compute an approximate pdf for the $\Delta\Gamma$-approximation $\Delta\hat\Pi$ by multiplying
and inverting the discrete Fourier transform of the coefficients of the discretized
densities:$^{16}$

$$p_{\pi_1} * \cdots * p_{\pi_n}(x) \approx p^h_{\Delta\hat\Pi}(x) = h \sum_{j=0}^{N-1} p_j\, \delta_j(x) \quad\text{where}\quad P^j = (-1)^{(n-1)j}\, h^{n-1} \prod_{k=1}^{n} P^j_k. \qquad (4.92)$$

The sequences $\{P^j\}_{j=0}^{N-1}$ and $\{P^j_k\}_{j=0}^{N-1}$ are defined as the DFT of the sequences $\{p_j\}_{j=0}^{N-1}$ and
$\{p^k_j\}_{j=0}^{N-1}$ in $p^h_{\Delta\hat\Pi}(x)$ and $p^h_{\pi_k}(x)$, respectively. The DFT and the inverse DFT of a sequence with $N$
points can be computed with the fast Fourier transform (FFT) using $O(N \log N)$ floating-point
operations [BH95, GW98]. To compute the discrete approximation $p^h_{\Delta\hat\Pi}(\xi_j)$ therefore requires
a total of $O(nN \log N)$ floating-point operations.
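The product rule for the DFT coefficients can be sketched with NumPy's FFT as follows. This is an illustrative sketch under stated assumptions: it uses the unnormalized forward transform of `numpy.fft.fft`, an even number of grid points, and the sign factor $(-1)^{(n-1)j}$ to account for the grid starting at $-a$ rather than at the origin. The function name and the test densities are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def fft_convolve(coeffs, h):
    """Cell averages of the pdf of a sum of n independent terms.

    coeffs: list of n arrays with the cell averages of each pdf on the
            grid xi_j = -a + j*h, j = 0..N-1 (N even).
    """
    n, N = len(coeffs), len(coeffs[0])
    j = np.arange(N)
    prod = np.ones(N, dtype=complex)
    for p in coeffs:
        prod *= np.fft.fft(p)                  # unnormalized forward DFT
    # Sign factor (-1)^{(n-1) j}: shifts the cyclic result back onto the grid.
    sign = np.where(((n - 1) * j) % 2 == 0, 1.0, -1.0)
    return np.real(np.fft.ifft(sign * h ** (n - 1) * prod))

# Usage: the sum of two independent N(0, 1) variables is N(0, 2).
a, N = 10.0, 1024
h = 2.0 * a / N
xi = -a + h * np.arange(N)
Phi = np.vectorize(NormalDist().cdf)
p1 = (Phi(xi + h / 2) - Phi(xi - h / 2)) / h   # conservative cell averages
p_sum = fft_convolve([p1, p1], h)
```

The value of `p_sum` at the grid point $\xi = 0$ should be close to the $N(0, 2)$ density $1/\sqrt{4\pi}$, and the total mass `h * p_sum.sum()` should stay at one.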

To prove that this method works and that the computed distribution converges linearly as $h$
and $\epsilon$ decrease requires a bit of work. Essentially, the proof is an exercise in Fourier analysis
and it proceeds in two steps. In the first step, we prove that, for a fixed interval $[-a, a]$,
equation (4.92) converges to the cyclic convolution. In the second step, we show that as $a$
grows the cyclic convolution approximates the standard convolution.

For the set of integrable functions with compact support in $[-a, a]$, we define the cyclic
convolution by the integral

$$f \circledast g(x) = \int_{-a}^{a} f^p(x - y)\, g(y)\,dy, \qquad x \in [-a, a], \qquad (4.93)$$

and to be zero elsewhere. The function $f^p(x)$ is the periodic extension of $f(x)$; in other words
$f^p(x) = \sum_k f(x - 2ak)$. It is easy to show that the cyclic convolution is commutative and
linear. Furthermore, if the two functions are in $L^1$, we have

$$\| f \circledast g \|_1 \le \| f \|_1 \| g \|_1,$$

mimicking the similar property for the standard convolution.


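Theorem 4.1. Assume $f$ and $g$ are Riemann integrable and have a finite number of discontinuities.
Let

$$f^h = h \sum_{k=0}^{N-1} f_k \delta_k \quad\text{and}\quad g^h = h \sum_{k=0}^{N-1} g_k \delta_k$$

be discretizations (in the manner of Section 4.5.2). Then

$$f^h \circledast g^h = h \sum_{k=0}^{N-1} d_k \delta_k, \quad\text{where}\quad D^k = (-1)^k h F^k G^k,$$

and $f^h \circledast g^h \to f \circledast g$ and $f^h \circledast g^h \to (f \circledast g)^h$ in the weak sense.


16 Given a sequence $\{y_j\}_{j=0}^{N-1}$, the discrete Fourier transform (DFT) $\{Y^k\}_{k=0}^{N-1}$ and its inverse are defined by

$$Y^k = \sum_{j=0}^{N-1} y_j\, \omega^{jk}, \qquad k = 0, 1, \ldots, N-1,$$

$$y_k = \frac{1}{N} \sum_{j=0}^{N-1} Y^j \omega^{-jk}, \qquad k = 0, 1, \ldots, N-1,$$

where $\omega = e^{-i 2\pi/N}$.


Proof. We prove the two statements separately, starting with the first one.


1. The cyclic convolution $f^h \circledast g^h$ is

$$f^h \circledast g^h(x) = h \sum_{k=0}^{N-1} \left( h \sum_{l=0}^{N-1} f_{(k-l)-N/2}\, g_l \right) \delta_k(x),$$

and

$$d_k = h \sum_{l=0}^{N-1} f_{(k-l)-N/2}\, g_l, \qquad k = 0, 1, \ldots, N-1,$$

with indices taken modulo $N$. By applying the DFT with $\omega = e^{-i 2\pi/N}$, it follows that

$$D^k = \sum_{j=0}^{N-1} d_j\, \omega^{jk} = h\, \omega^{Nk/2} \sum_{l=0}^{N-1} g_l\, \omega^{lk} \sum_{j=0}^{N-1} f_{(j-l)-N/2}\, \omega^{((j-l)-N/2)k} = (-1)^k h F^k G^k.$$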

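2. It remains to show that $f^h \circledast g^h \to f \circledast g$ and $f^h \circledast g^h \to (f \circledast g)^h$ in the weak sense.
Since $f$ and $g$ are piecewise continuous, we have, for all points of continuity,

$$\frac{1}{h} \int_{x - h/2}^{x + h/2} f(z - y)\,dy = f(z - x) + q(z - x, h)h,$$

$$\frac{1}{h} \int_{x - h/2}^{x + h/2} g(y)\,dy = g(x) + r(x, h)h,$$

where $q(\cdot, h)h \to 0$ and $r(\cdot, h)h \to 0$ as $h \to 0$. Let $\phi(x)$ be a smooth test function. Then

$$\int_{-a}^{a} f^h \circledast g^h(x)\, \phi(x)\,dx = \sum_{k=0}^{N-1} \phi(\xi_k) \sum_{l=0}^{N-1} \left( \int_{\xi_l - h/2}^{\xi_l + h/2} f^p(\xi_k - y)\,dy \right) \times \left( \int_{\xi_l - h/2}^{\xi_l + h/2} g(y)\,dy \right)$$

$$= \sum_{k=0}^{N-1} h \phi(\xi_k) \sum_{l=0}^{N-1} h\, (f^p(\xi_k - \xi_l) + h q(\xi_k - \xi_l, h)) \times (g(\xi_l) + h r(\xi_l, h)).$$

We note that

$$\sum_{l=0}^{N-1} h f^p(\xi_k - \xi_l)\, g(\xi_l) \to \int_{-a}^{a} f^p(\xi_k - y)\, g(y)\,dy = f \circledast g(\xi_k)$$

and

$$\sum_{k=0}^{N-1} h \phi(\xi_k)\, f \circledast g(\xi_k) \to \int_{-a}^{a} f \circledast g(x)\, \phi(x)\,dx.$$

If necessary, the finite number of points with discontinuities may be excluded without
changing the limits. The remaining integrals involving $r$ and $q$ vanish, proving that
$f^h \circledast g^h \to f \circledast g$.

Since $f \circledast g$ is Riemann integrable and piecewise continuous, it follows that

$$\frac{1}{h} \int_{x - h/2}^{x + h/2} f \circledast g(y)\,dy = f \circledast g(x) + h s(x, h)$$

at all points of continuity. Hence,

$$\int_{-a}^{a} (f \circledast g)^h(x)\, \phi(x)\,dx = \sum_{k=0}^{N-1} h \phi(\xi_k)\, \frac{1}{h} \int_{\xi_k - h/2}^{\xi_k + h/2} f \circledast g(y)\,dy \to \int_{-a}^{a} f \circledast g(x)\, \phi(x)\,dx$$

as $h \to 0$. □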


By repeated applications of Theorem 4.1, we conclude that the right-hand side of equation (4.92)
converges weakly to the cyclic convolution of the truncated densities. Our remaining
obligation is to show that the standard convolution can be approximated by the cyclic
convolution. The intuition is that, since a pdf decays in the tails, for a large enough interval
the density of the overlapping regions in the cyclic convolution decreases.

In the first lemma, we show that the error from restricting the pdfs to an interval containing 
the majority of the density is small. 


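Lemma 4.1. Given the probability density functions $p_{\pi_i}$, where $i = 1, \ldots, n$, let

$$\tilde p_{\pi_i}(x) = \begin{cases} p_{\pi_i}(x), & \text{if } -a \le x \le a, \\ 0, & \text{otherwise}, \end{cases}$$

where $\| p_{\pi_i} - \tilde p_{\pi_i} \|_1 < \epsilon$ for some $\epsilon > 0$ and for all $i = 1, \ldots, n$. Then

$$\| p_{\pi_1} * \cdots * p_{\pi_n} - \tilde p_{\pi_1} * \cdots * \tilde p_{\pi_n} \|_1 \le \epsilon n.$$

Proof. The statement follows immediately from the inequality $\| f * g \|_1 \le \| f \|_1 \| g \|_1$:

$$\| p_{\pi_1} * \cdots * p_{\pi_n} - \tilde p_{\pi_1} * \cdots * \tilde p_{\pi_n} \|_1 \le \| (p_{\pi_1} - \tilde p_{\pi_1}) * p_{\pi_2} * \cdots * p_{\pi_n} \|_1 + \cdots + \| \tilde p_{\pi_1} * \cdots * \tilde p_{\pi_{n-1}} * (p_{\pi_n} - \tilde p_{\pi_n}) \|_1 \le \epsilon n.$$

In the next lemma, we show that if two pdfs are small outside an interval around the origin
and if the cyclic convolution is taken over a large enough interval, then the error is small.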


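Lemma 4.2. Assume that the pdfs $f$ and $g$ satisfy

$$f(x) < C_1 \frac{1}{|x|^{\alpha+1}} \quad\text{and}\quad g(x) < C_2 \frac{1}{|x|^{\alpha+1}} \qquad (4.94)$$

for some $\alpha > 0$ for all $x$ outside some interval $[-b, b]$. Suppose that $a > 2b$. Let $\tilde f$ and $\tilde g$ be
the restrictions of $f$ and $g$, respectively, to the interval $[-a, a]$. Then

$$0 \le \tilde f \circledast \tilde g(x) - \tilde f * \tilde g(x) \le \frac{1}{(a - |x|/2)^{\alpha+1}} \left( C_2 D_1(x) + C_1 D_2(x) \right),$$

where

$$D_1(x) = \begin{cases} \displaystyle\int_{-a+x/2}^{-a+x} f(y)\,dy, & \text{if } x \ge 0, \\[2ex] \displaystyle\int_{a+x}^{a+x/2} f(y)\,dy, & \text{if } x < 0, \end{cases}$$

and

$$D_2(x) = \begin{cases} \displaystyle\int_{-a+x/2}^{-a+x} g(y)\,dy, & \text{if } x \ge 0, \\[2ex] \displaystyle\int_{a+x}^{a+x/2} g(y)\,dy, & \text{if } x < 0. \end{cases}$$


Proof. We prove the bound in two steps. Consider a point $x$ in the interval $[-a, a]$. Suppose
that $0 < x < a$.


1. Then

$$\tilde f * \tilde g(x) = \int_{-a+x}^{a} f(x - y)\, g(y)\,dy$$

and

$$\tilde f \circledast \tilde g(x) = \int_{-a}^{-a+x} f(x - y - 2a)\, g(y)\,dy + \int_{-a+x}^{a} f(x - y)\, g(y)\,dy.$$

Since $f$ and $g$ are positive, the error satisfies

$$0 \le \tilde f \circledast \tilde g(x) - \tilde f * \tilde g(x) = \int_{-a}^{-a+x} f(x - y - 2a)\, g(y)\,dy.$$


2. Consider the interval $[-a, -a + x]$. On $[-a, -a + x/2]$ the function $g(y)$ satisfies equation (4.94).
On $[-a + x/2, -a + x]$ the argument $x - y - 2a$ lies in $[-a, -a + x/2]$, so $f(x - y - 2a)$ satisfies equation (4.94). Hence, after a change of variable in the second integral,

$$\int_{-a}^{-a+x/2} f(x - y - 2a)\, g(y)\,dy \le C_2 \int_{-a}^{-a+x/2} \frac{1}{|y|^{\alpha+1}} f(x - y - 2a)\,dy,$$

$$\int_{-a+x/2}^{-a+x} f(x - y - 2a)\, g(y)\,dy \le C_1 \int_{-a}^{-a+x/2} \frac{1}{|y|^{\alpha+1}} g(x - y - 2a)\,dy.$$

Since $|y|^{-(\alpha+1)} \le (a - |x|/2)^{-(\alpha+1)}$ on $[-a, -a + x/2]$, by combining the two inequalities we find

$$0 \le \int_{-a}^{-a+x} f(x - y - 2a)\, g(y)\,dy \le \frac{1}{(a - |x|/2)^{\alpha+1}} \left( C_2 D_1(x) + C_1 D_2(x) \right),$$

as required.


The proof for negative $x$ is similar. The case $x \notin [-a, a]$ is trivial since both functions are
zero outside the interval. □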

With the help of the two lemmas, we are in a position to relate the cyclic approximation 
to the standard convolution. 


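Theorem 4.2. Let $p_{\pi_i}$ for $i = 1, \ldots, n$ be probability density functions that satisfy the
assumptions in Lemma 4.1 and Lemma 4.2. Then

$$\| p_{\pi_1} * \cdots * p_{\pi_n} - \tilde p_{\pi_1} \circledast \cdots \circledast \tilde p_{\pi_n} \|_1 = O(\epsilon + a^{-\alpha}).$$


Proof. Use Lemma 4.2 to bound the error from the cyclic convolution. Since the result is
asymptotic, we may assume that $a > 2b$, so that the lemma can be applied. Note that

$$\tilde p_{\pi_1} * \cdots * \tilde p_{\pi_n} - \tilde p_{\pi_1} \circledast \cdots \circledast \tilde p_{\pi_n} = (\tilde p_{\pi_1} * \tilde p_{\pi_2} - \tilde p_{\pi_1} \circledast \tilde p_{\pi_2}) * \cdots * \tilde p_{\pi_n} + \cdots + \tilde p_{\pi_1} \circledast \cdots \circledast (\tilde p_{\pi_{n-1}} * \tilde p_{\pi_n} - \tilde p_{\pi_{n-1}} \circledast \tilde p_{\pi_n}).$$

By the lemma and because $|D(x)| < 1$, it follows that

$$\| \tilde p_{\pi_1} * \tilde p_{\pi_2} - \tilde p_{\pi_1} \circledast \tilde p_{\pi_2} \|_1 \le 2(C_1 + C_2) \int_{0}^{a} \frac{dx}{(a - x/2)^{\alpha+1}} = \frac{4(C_1 + C_2)}{\alpha} \left( \left(\frac{2}{a}\right)^{\alpha} - \left(\frac{1}{a}\right)^{\alpha} \right).$$

Hence,

$$\| \tilde p_{\pi_1} * \cdots * \tilde p_{\pi_n} - \tilde p_{\pi_1} \circledast \cdots \circledast \tilde p_{\pi_n} \|_1 = O(a^{-\alpha}).$$

By Lemma 4.1 and the triangle inequality, it follows that

$$\| p_{\pi_1} * \cdots * p_{\pi_n} - \tilde p_{\pi_1} \circledast \cdots \circledast \tilde p_{\pi_n} \|_1 = O(\epsilon + a^{-\alpha}). \qquad □$$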


Theorems 4.1 and 4.2 show that equation (4.92) works. As later examples demonstrate, 
the rate of convergence for numerical experiments appears to be linear (see upcoming 
Figure 4.14). It is therefore reasonable to guess that a stronger version of Theorem 4.1 may 
be true. 
4.5.6 Computing Value-at-Risk


In this section, we discuss how to compute value-at-risk from discrete approximation (4.92)
to the pdf of the $\Delta\Gamma$-approximation. Recall that the value-at-risk is defined by nonlinear
equation (4.1), and we approximate it by the solution to equation (4.85). Given the
pdf $p^h_{\Delta\hat\Pi}(x) = h \sum_{j=0}^{N-1} p_j \delta_j(x)$, we simply add the coefficients to get the cdf at the grid
points

$$P_{\Delta\hat\Pi}(\xi_k) = \frac{h}{2} p_0 + h \sum_{j=1}^{k} p_j \quad\text{and}\quad P_{\Delta\hat\Pi}(\xi_N) = P_{\Delta\hat\Pi}(a) = h p_0 + h \sum_{j=1}^{N-1} p_j. \qquad (4.95)$$

Since the first point of the grid corresponds to the interval $[-a, -a + h/2] \cup [a - h/2, a]$, we
choose to assign half of the density to the right-end grid point, $a$, and half to the left-end
grid point, $-a$. Value-at-risk can then be computed by interpolating the cdf. Since the cdf
is an increasing function, we search for the index $k$ such that $P_{\Delta\hat\Pi}(\xi_k) < 1 - \alpha \le P_{\Delta\hat\Pi}(\xi_{k+1})$,
and therefore $-\mathrm{VaR} - \xi$ is in the interval $[\xi_k, \xi_{k+1}]$. The linear interpolant to the inverse
cdf is

$$L(P) = \xi_k + (\xi_{k+1} - \xi_k)\, \frac{P - P_{\Delta\hat\Pi}(\xi_k)}{P_{\Delta\hat\Pi}(\xi_{k+1}) - P_{\Delta\hat\Pi}(\xi_k)}, \qquad P \in [P_{\Delta\hat\Pi}(\xi_k), P_{\Delta\hat\Pi}(\xi_{k+1})].$$

The desired approximation to value-at-risk is $\mathrm{VaR} = -L(1 - \alpha) - \xi$.


4.5.7 Richardson's Extrapolation Improves Accuracy


In practice, we have found that the observed rate of convergence can be improved with a
step of Richardson’s extrapolation. In our implementation, we compute two solutions, $\mathrm{VaR}_1$
and $\mathrm{VaR}_2$, on a fine grid with step size $h$ and a coarse grid with step size $2h$, respectively.
Richardson’s extrapolation then gives the solution (see footnote 17)


$$\mathrm{VaR} = 2\,\mathrm{VaR}_1 - \mathrm{VaR}_2.$$


Although Richardson’s extrapolation can be extended to eliminate higher-order errors, we 
have found that additional levels of extrapolation do not lead to further improvements. The 
steps of the fast convolution method for value-at-risk are summarized in Algorithm 2. 
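
To see why one step of extrapolation helps, consider any approximation whose leading error is linear in the step size. The sketch below uses a forward-difference derivative as a stand-in for the value-at-risk computation; that substitution, and the names `forward_difference` and `richardson`, are assumptions made for illustration only.

```python
import math

def forward_difference(f, x, h):
    """First-order approximation of f'(x); the error is c*h + o(h)."""
    return (f(x + h) - f(x)) / h

def richardson(y_fine, y_coarse):
    """Combine results for step sizes h and 2h to cancel the linear error term."""
    return 2.0 * y_fine - y_coarse

step = 1e-3
y_fine = forward_difference(math.exp, 0.0, step)          # step size h
y_coarse = forward_difference(math.exp, 0.0, 2.0 * step)  # step size 2h
y_extrapolated = richardson(y_fine, y_coarse)

exact = 1.0  # derivative of exp at 0
err_plain = abs(y_fine - exact)
err_extrapolated = abs(y_extrapolated - exact)
```

In this toy problem the extrapolated error is several orders of magnitude smaller than the error on the fine grid alone, because the term proportional to $h$ cancels.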


17 Suppose we want to compute $y$. If we have two approximations satisfying

$$y_1 = y + ch + o(h),$$
$$y_2 = y + c(2h) + o(h),$$

then the linear error term can be eliminated with Richardson’s extrapolation

$$2y_1 - y_2 = y + o(h).$$

It is easy to derive similar formulas for higher-order terms.


Algorithm 2
Fast Convolution Method for Value-at-Risk


Input: A $\Delta\Gamma$-approximation. A time series of daily returns. A confidence level
$0 < \alpha < 1$. A number of grid points $N$ and a bound for the grid interval $a$.
Output: The value-at-risk for one day with confidence level $\alpha$.


• Estimate the parameters and factorize the $\Delta\Gamma$-approximation with Algorithm 1.
• Create the grid. Define $\{\xi_k = -a + hk\}_{k=0}^{N-1}$, where $h = 2a/N$.
• Discretize the densities,

$$p_i^h = h\sum_{j=0}^{N-1} p_j^i\delta_j, \quad\text{where}\quad p_j^i = \frac{1}{h}\int_{\xi_j - h/2}^{\xi_j + h/2} p_i(u)\,du.$$

• Compute the density for the $\Delta\Gamma$-approximation with the FFT:

$$p^h_{\Delta\tilde\Pi} = h\sum_{j=0}^{N-1} p_j\delta_j,$$

where the coefficients $p_j$ are obtained from the discrete convolution of the discretized
densities $p^k$, $k = 1, \dots, n$, computed with the FFT as in equation (4.92).
• Compute the discrete cdf $P^h_{\Delta\tilde\Pi}$ and find $k$ such that

$$P^h_{\Delta\tilde\Pi}(\xi_k) < 1 - \alpha \le P^h_{\Delta\tilde\Pi}(\xi_{k+1}).$$

• Compute the linear interpolant $L(\cdot)$ for the inverse over $[\xi_k, \xi_{k+1}]$ and let
$\mathrm{VaR} = -L(1 - \alpha) - \Xi$.
• Repeat from step 2 for a grid with step size $2h$. Extrapolate to get a more
accurate approximation.
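
The convolution step of the algorithm can be illustrated with a small, self-contained Python sketch. It is not the authors' implementation: it uses point values of the density instead of cell averages, a plain recursive radix-2 FFT, and an explicit index shift by $N/2$ to account for the grid starting at $-a$. The names `fft` and `convolve_on_grid` are hypothetical.

```python
import cmath
import math

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def convolve_on_grid(p1, p2, h):
    """Density of the sum of two independent variables on xi_k = -a + h*k."""
    n = len(p1)
    f1, f2 = fft(p1), fft(p2)
    q = fft([u * v for u, v in zip(f1, f2)], inverse=True)
    # q[m] / n is the circular convolution. Since xi_m - xi_j = xi_{m-j+n/2},
    # the result has to be shifted by n/2 grid points.
    return [h * q[(m + n // 2) % n].real / n for m in range(n)]

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

a_grid, n_grid = 8.0, 256
h_grid = 2.0 * a_grid / n_grid
p_std = [normal_pdf(-a_grid + h_grid * j) for j in range(n_grid)]

# The sum of two independent standard normals is normal with variance 2.
p_sum = convolve_on_grid(p_std, p_std, h_grid)
density_at_zero = p_sum[n_grid // 2]   # grid point xi = 0
total_mass = h_grid * sum(p_sum)
```

The shift by $N/2$ in the index is equivalent to multiplying every second Fourier coefficient by $-1$, which is one way a sign factor can enter an FFT-based formula on a grid centered at zero.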











The advantage of Richardson’s extrapolation is clearly illustrated by the computational 
example in Figure 4.14. The four graphs show the error as a function of the step size h. The 
functions are 

(i) $\Delta\Pi = X_1 + X_2$,
(ii) $\Delta\Pi = X_1^2 + X_2^2$,
(iii) $\Delta\Pi = -(X_1^2 + X_2^2)$,
(iv) $\Delta\Pi = X_1^2 + X_2^2 - X_3^2 - X_4^2 + X_5$,
and the random variables $X_i$ are normal. The graphs show the error for confidence level $\alpha$
equal to 1%, 5%, 10%, and 20% and for value-at-risk computed with and without Richardson’s
extrapolation. The error is estimated as the difference between the value-at-risk for
consecutive grid sizes. The rate of convergence was computed using linear regression.


280 CHAPTER 4. Numerical methods for value-at-risk 


FIGURE 4.14 Error in value-at-risk as a function of step size for four abstract problems, shown in
panels (i)-(iv). Each graph shows the error for the confidence levels 1%, 5%, 10%, and 20%, and for
value-at-risk computed with and without Richardson’s extrapolation. Extrapolation is superior since
the errors are smaller, and the observed rate of convergence improves from 1 to between 1.5 and 2,
depending on the problem.


Without extrapolation, the observed rate of convergence is very close to 1, for all four prob- 
lems and all confidence levels. When extrapolation is used, all problems show a faster rate 
of convergence, but the systematic relationship is less clear. The estimated rates of con- 
vergence range from approximately 1.5 for problems (ii) and (iii) to approximately 2 for 
problem (i). 


4.5.8 Computational Complexity


The number of floating point operations for Algorithm 2 is $O(n^2 \min(d, n) + nN \log N)$,
where n is the number of risk factors, d is the number of dates in the time series, and N 
is the grid size. Figure 4.15 shows the computation time for portfolios for increasing n; the 
remaining parameters are fixed, with d = 1000 and N = 4096. The figure shows that the time 
is essentially linear in n. In addition, we note that a large portion of the computation time is 
spent in the parameter estimation step. As expected, the parameter estimation for the Parzen 
model is much slower than for the asymmetric t model, which in turn is slower than the 
normal model.

FIGURE 4.15 Computation time versus the number of risk factors for the three return models. The
parameter estimation takes a large portion of the total time.


4.6 Examples 


4.6.1 Fat Tails and Value-at-Risk


We illustrate the performance of the algorithm with an example. The example also demonstrates
the importance of a return model that incorporates fat tails and that includes information
about the curvature of the function for the portfolio value. Consider a portfolio containing
one European call option on each of BCE and Canadian Tire. The options are at-the-money
and have 3 months to maturity. The Black-Scholes prices of the options are $3.23 and $1.39.
The Hessian in the $\Delta\Gamma$-approximation is

$$\Gamma = \begin{bmatrix} 99.0967 & 0 \\ 0 & 42.6138 \end{bmatrix}.$$

The portfolio is similar to the example in Section 4.3.2, but it has a shorter time to maturity,
which increases the curvature.

An investor who has sold this portfolio will see her holdings decrease in value if the stock
prices increase. To hedge the portfolio, she might take a linear position that offsets the $\Delta$ of
the portfolio. To examine how the value-at-risk changes with the $\Delta$ vector, we computed the
95% and 99% value-at-risk on a grid with $-10 \le \Delta_i \le 10$, where $i = 1, 2$. Figures 4.16 and
4.17 show the level sets for value-at-risk. Each graph is computed with a 30 x 30 grid for $\Delta$.
For the first two graphs in Figure 4.16, the dynamics of portfolio value are approximated by a
linear function (the $\Delta$-approximation), and the returns on the
risk factors are modeled by a multivariate normal with zero mean. The remaining six graphs
in Figures 4.16 and 4.17 were computed with the fast convolution method with N = 4096
grid points. The dynamics of portfolio value are approximated by the same $\Delta\Gamma$-approximation
in all six simulations, and the risk factors are modeled using the three models introduced in
Section 4.1. We see that the linear model oversimplifies the problem. In particular, it suggests
that the risk is eliminated completely by the hedging strategy, which is not true. From the
graph it is clear that it also underestimates the risk away from the origin.


FIGURE 4.16 Part I. Value-at-risk for a short position in an option portfolio as a function of $\Delta_1$ and
$\Delta_2$. The horizontal axis is $\Delta_1$, the linear position in BCE; the vertical axis is $\Delta_2$, the linear position
in CTRa. Panel groups: $\Delta$-approximation and normal returns; $\Delta\Gamma$-approximation and normal returns.
Each group shows (i) value-at-risk for $\alpha$ = 95% and (ii) value-at-risk for $\alpha$ = 99%.

FIGURE 4.17 Part II. Value-at-risk for a short position in an option portfolio as a function of $\Delta_1$ and
$\Delta_2$. The horizontal axis is $\Delta_1$, the linear position in BCE; the vertical axis is $\Delta_2$, the linear position
in CTRa. Panel groups: $\Delta\Gamma$-approximation and asymmetric t returns; $\Delta\Gamma$-approximation and Parzen
estimate for returns. Each group shows (i) value-at-risk for $\alpha$ = 95% and (ii) value-at-risk for $\alpha$ = 99%.


Although the remaining six graphs are more in agreement with each other, we can see
some interesting differences. Again the estimate for the value-at-risk is smaller for the normal
model as compared to the asymmetric t and Parzen models. For $\Delta = 0$, the relative differences
are approximately 16% (38%) and 80% (92%) for the asymmetric t model (Parzen model) and
for the 95% and 99% value-at-risk, respectively. The differences between the asymmetric t
and Parzen models are about 19% and 7% for the 95% and 99% value-at-risk, respectively.
The Parzen model gives a larger value-at-risk as compared to both of the other models. The
level sets of the normal and asymmetric t models are elliptical, whereas the level sets for
the Parzen model display less symmetry. The fat tails are primarily a concern for portfolios
with negative curvature. Figure 4.18 shows the 95% value-at-risk for a long position in the
example portfolio. The differences between the four graphs are much smaller, and the last
three are almost identical close to the origin. Of course, this just confirms that buying the
call options is much less risky than selling them.

FIGURE 4.18 Part III. The 95% value-at-risk for a long position in an option portfolio as a function
of $\Delta_1$ and $\Delta_2$. The horizontal axis is $\Delta_1$, the linear position in BCE; the vertical axis is $\Delta_2$, the linear
position in CTRa. The last two panels are (iii) asymmetric t returns and (iv) Parzen estimate for returns.


4.6.2 So Which Result Can We Trust?


To better understand the simulation results, we can take a closer look at a long position
in the delta-hedged portfolio, the portfolio with A = 0. Recall that if the value-at-risk is 
correct, we expect to have approximately 5 (25) losses exceeding the 99% (95%) value-at-risk 
for a sample of 500 returns. We repeated the value-at-risk simulation for 500 consecutive 
days and computed the number of losses greater than value-at-risk over the 500 returns used 
in the calculation. Figure 4.19 shows histograms for the normal, asymmetric t, and Parzen 
models. 
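
The exceedance count used in this check is simple to state in code. The sketch below is illustrative only; the function names and the five-element return series are made up for the example.

```python
def expected_exceedances(n_returns, confidence):
    """Expected number of losses beyond value-at-risk if the estimate is correct."""
    return n_returns * (1.0 - confidence)

def count_exceedances(returns, value_at_risk):
    """Number of returns whose loss (the negative return) exceeds value-at-risk."""
    return sum(1 for r in returns if -r > value_at_risk)

expected_99 = expected_exceedances(500, 0.99)
expected_95 = expected_exceedances(500, 0.95)

# Hypothetical portfolio value changes and a hypothetical value-at-risk of 2.0.
toy_returns = [-3.0, 1.0, -0.5, -2.5, 0.2]
toy_count = count_exceedances(toy_returns, 2.0)
```

A model whose observed count is persistently above the expected value underestimates the risk, and one whose count is persistently below it overestimates the risk.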

From the examples in Section 4.1, we know that the normal model does not produce
an accurate model for the tails. This is confirmed by the simulation, since the number
of losses greater than value-at-risk deviates from the expected values. The histograms for
the asymmetric t model are centered close to the expected values. The Parzen model produces
a good estimate for the 99% value-at-risk, but it seems to overestimate the value-at-risk
at the smaller confidence level. We conclude that the asymmetric t and Parzen models are
preferable to the normal model. Furthermore, for this example, both models give acceptable
results for the 99% value-at-risk, and the asymmetric t model produces a better estimate for
the 95% value-at-risk.

FIGURE 4.19 Histograms of the number of losses larger than value-at-risk for $\alpha$ = 95% and $\alpha$ = 99%.
Each graph (one per return model, beginning with normal returns) shows two superimposed histograms,
where the left “bump” is the result for 99% value-at-risk and the right bump is the result for the
95% value-at-risk.


Computing the Gradient of Value-at-Risk 


For small portfolios, such as the one in our example, it is possible to understand how value-at- 
risk changes with portfolio composition by computing it many times, varying the parameters, 
and visualizing the result. For large portfolios and for increasing number of parameters, this 
is a time-consuming strategy, and the result becomes harder to interpret. In this section, we 
extend the fast convolution method to compute the gradient of value-at-risk. It is an interesting 
problem because the gradient gives local information about how the value-at-risk changes 
with portfolio composition. Gradient information is important to understand and evaluate 
decisions about changes in the portfolio. In [MR98] Mausser and Rosen review applications 
and methods for value-at-risk gradients. Monte Carlo methods to compute value-at-risk have
been proposed independently by several authors; see, for example, the methods developed by
Păun [Pău99] and Mausser and Rosen [MR98]. For the linear model with normal returns, the 
gradient may be interpreted in terms of “risk contributions” from instruments or risk factors; 
see, for example, the paper by Hallerbach [Hal99]. 

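Recall that value-at-risk is defined by nonlinear equation (4.1). In the fast convolution
method, it is approximated by the solution to equation (4.85). We can reformulate
equation (4.85) as a one-dimensional integral over the pdf of $\Delta\tilde\Pi$,

$$\int_{-\infty}^{-\mathrm{VaR}} p_{\Delta\tilde\Pi}(x)\,dx = 1 - \alpha.$$

Then the gradient (i.e., the derivatives with respect to the “Greeks”) can be computed by implicit
differentiation:

$$\frac{\partial \mathrm{VaR}}{\partial \Theta} = \frac{1}{p_{\Delta\tilde\Pi}(-\mathrm{VaR})}\int_{-\infty}^{-\mathrm{VaR}} \frac{\partial p_{\Delta\tilde\Pi}}{\partial \Theta}\,dx, \tag{4.96}$$

$$\frac{\partial \mathrm{VaR}}{\partial \Delta_i} = \frac{1}{p_{\Delta\tilde\Pi}(-\mathrm{VaR})}\int_{-\infty}^{-\mathrm{VaR}} \frac{\partial p_{\Delta\tilde\Pi}}{\partial \Delta_i}\,dx, \tag{4.97}$$

$$\frac{\partial \mathrm{VaR}}{\partial \Lambda_{ii}} = \frac{1}{p_{\Delta\tilde\Pi}(-\mathrm{VaR})}\int_{-\infty}^{-\mathrm{VaR}} \frac{\partial p_{\Delta\tilde\Pi}}{\partial \Lambda_{ii}}\,dx. \tag{4.98}$$

Of course, rather than the derivatives for the parameters in the portfolio factorization, we
want the derivatives for the parameters in the original $\Delta\Gamma$-approximation or the portfolio
positions $\theta_i$. Fortunately, these can be computed from equations (4.96)-(4.98) (see footnote 18).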
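
The implicit-differentiation formula can be checked on a case where the answer is known. The sketch below assumes, purely for illustration, that the portfolio value change is normal with mean `mu` and unit variance, so that value-at-risk equals minus the mean minus a fixed quantile and its derivative with respect to the mean is exactly -1. All function names are hypothetical.

```python
import math

def pdf(x, mu):
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)

def cdf(x, mu):
    return 0.5 * (1.0 + math.erf((x - mu) / math.sqrt(2.0)))

def dpdf_dmu(x, mu):
    """Partial derivative of the density with respect to the parameter mu."""
    return (x - mu) * pdf(x, mu)

def value_at_risk(mu, alpha):
    """Solve cdf(-VaR) = 1 - alpha by bisection."""
    lo, hi = mu - 10.0, mu + 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cdf(mid, mu) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return -0.5 * (lo + hi)

def var_gradient(mu, alpha, n_steps=20000):
    """Integral of the density derivative up to -VaR, divided by the density there."""
    var = value_at_risk(mu, alpha)
    lower, upper = mu - 10.0, -var   # mu - 10 stands in for minus infinity
    h = (upper - lower) / n_steps
    total = 0.5 * (dpdf_dmu(lower, mu) + dpdf_dmu(upper, mu))
    total += sum(dpdf_dmu(lower + i * h, mu) for i in range(1, n_steps))
    return total * h / pdf(-var, mu)

grad_mu = var_gradient(0.3, 0.99)
eps = 1e-4
finite_difference = (value_at_risk(0.3 + eps, 0.99) - value_at_risk(0.3 - eps, 0.99)) / (2.0 * eps)
```

Both the integral formula and the finite difference should return a value very close to -1 in this toy case.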


4.6.4 The Value-at-Risk Gradient and Portfolio Composition


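Suppose that the VaR gradient with respect to the parameters in the $\Delta\Gamma$-approximation is
known. The gradient with respect to $\theta_i$, the quantity invested in the $i$th security, follows from
the chain rule:

$$\frac{\partial \mathrm{VaR}}{\partial \theta_i} = \frac{\partial \mathrm{VaR}}{\partial \Theta}\frac{\partial \Theta}{\partial \theta_i} + \sum_{k=1}^{n}\frac{\partial \mathrm{VaR}}{\partial \Delta_k}\frac{\partial \Delta_k}{\partial \theta_i} + \sum_{k=1}^{n}\sum_{j=1}^{n}\frac{\partial \mathrm{VaR}}{\partial \Gamma_{kj}}\frac{\partial \Gamma_{kj}}{\partial \theta_i}. \tag{4.99}$$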







The derivatives for the parameters of the $\Delta\Gamma$-approximation can in turn be computed
from the derivatives with respect to the parameters in the portfolio factorization, equations (4.31),
(4.34), and (4.35). The $\Theta$ derivative, in equation (4.96), immediately gives


aVaR | 
30 





For the remaining two derivatives, it follows after some calculation that the gradient vector 


for A; is 
dVaR |" dVaR |" 
=A -f 4.100 
| dA; g (| ôA; I a) ( ) 


18 In the gradient computation, we have left out the contribution to the gradient from the parameter estimates.

and the Jacobian matrix for $\Gamma_{ij}$ is


aR |" aR |” A~ VaR |” aa |" ~AT 
e (eT o ola ar) eo 


YD ILj= y Ji, j= i,j= 








Here we have used the shorthand notation $[v_i]_{i=1}^{n} = (v_1, \dots, v_n)^T$ and $[M_{ij}]_{i,j=1}^{n} = M$ for any
$n \times n$ matrix $M$ with elements $M_{ij}$.

Computing the Gradient 


Consider integrals (4.96)-(4.98). We note that the density $p_{\Delta\tilde\Pi}(-\mathrm{VaR})$ is directly available
in Algorithm 2; it can be computed by interpolation. Therefore, the task that remains is to
approximate the two integrands in equations (4.97) and (4.98).

The derivative with respect to A; is 


OP sit OD x, 








ok Px,): 
Similarly, the derivative with respect to the diagonal element A,; is 


Op fl Pr; 
TA Ta(Pm A K Pa): 


U 


To find the derivatives for A;;, where i # j, we have to resort to a slightly different 
technique.!? The matrix A is a Hessian and hence is symmetric. Therefore, we only have 
to consider derivatives in directions that preserve symmetry. Consider a perturbation in the 
direction E;;+ E; (i # j). Differentiating 


GD a) 
alt IC a EEEo 


Since the curve is tangent to E; 


we get 


d 
dt 





j+ E; we consider pñ along this curve as a function of t: 


Pañ = Pr, * °° * Pr (Ault), Aj) + Pa, (AiO), AGO) * + Pa 


-1 

1 A ae, 0 ead eae, 

a. aa) 
t=0 


19 Differentiating a function $f: \mathbb{R}^n \to \mathbb{R}$ is often simplified if the derivation is carried out in a convenient basis.
The derivative in the direction of $v$ at $x$ satisfies

For this curve, we get 


d 


d 
A’(t) = — 
T (t) 


hah dt 








$$\nabla_v f = v^T \nabla f(x),$$
where $\nabla f(x)$ is the gradient of $f$ at $x$. Hence, if $b_1, \dots, b_n$ is a basis for $\mathbb{R}^n$, then the gradient can be computed
from the derivatives $\nabla_{b_i} f$ by
$$\nabla f(x) = B^{-1}[\nabla_{b_i} f]_{i=1}^{n},$$
where the rows of the matrix $B$ are the basis vectors.

The derivative of $p_{\Delta\tilde\Pi}$ in the direction of $T = (E_{ij} + E_{ji})$ is the same as the derivative of
$p_{\Delta\tilde\Pi}$ along the curve renormalized to compensate for the curve’s not having unit speed:


d| Tam * 0 * Pa, (u(t), A; HA) # + Dy, (Ajj (1), Aj — AN # + * Dy) 


V. ~= 
rPañ = ae Ar A) 





Hence, it follows that 








OP ait = 1 AN IP ait A’ OP aii 
ðA UAA) \ 7 ôA; ' ôA, ` 
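
The change-of-basis identity from footnote 19, which recovers a gradient from directional derivatives taken along a convenient basis, can be checked numerically. The two-dimensional sketch below is an assumed toy example; the test function, the basis, and the helper names are not taken from the text.

```python
def directional_derivative(f, x, v, eps=1e-5):
    """Central-difference derivative of f at x in the (unnormalized) direction v."""
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    return (f(x_plus) - f(x_minus)) / (2.0 * eps)

def gradient_from_basis_2d(f, x, b1, b2):
    """Solve B g = d, where the rows of B are b1, b2 and d holds the directional derivatives."""
    d1 = directional_derivative(f, x, b1)
    d2 = directional_derivative(f, x, b2)
    det = b1[0] * b2[1] - b1[1] * b2[0]
    g0 = (d1 * b2[1] - b1[1] * d2) / det   # Cramer's rule
    g1 = (b1[0] * d2 - b2[0] * d1) / det
    return [g0, g1]

def toy_function(x):
    return x[0] ** 2 + 3.0 * x[0] * x[1]   # exact gradient: (2*x0 + 3*x1, 3*x0)

# A symmetric and an antisymmetric direction as the basis.
toy_gradient = gradient_from_basis_2d(toy_function, [1.0, 2.0], [1.0, 1.0], [1.0, -1.0])
```

At the point (1, 2) the exact gradient is (8, 3), which the basis computation reproduces up to rounding.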
Algorithm 3
Fast Convolution Method for Value-at-Risk Gradient


Input: A $\Delta\Gamma$-approximation. A time series of daily returns. A confidence level
$0 < \alpha < 1$. A number of grid points $N$ and a bound for the grid interval $a$.

Output: The value-at-risk for one day with confidence level $\alpha$. The gradient of
value-at-risk for the parameters of the $\Delta\Gamma$-approximation.


• Compute value-at-risk with Algorithm 2.
• Compute the discretized partial derivative for $p_i$:

$$\frac{\partial p_i^h}{\partial \Delta_i} = h\sum_{j=0}^{N-1} \frac{\partial p_j^i}{\partial \Delta_i}\,\delta_j.$$

• Convolve the functions with the FFT:

$$\frac{\partial p^h_{\Delta\tilde\Pi}}{\partial \Delta_i} = h\sum_{j=0}^{N-1} r_j\delta_j,$$

where the coefficients $r_j$ are obtained as in equation (4.92), with the discretized density of the
$i$th term replaced by its discretized derivative and the product taken over the remaining terms $k \ne i$.
• Integrate over $[-a, -\mathrm{VaR}]$ by adding the coefficients $\{r_j\}$ and linear interpolation.
Let $I$ be this approximation.
• Compute $p_{\Delta\tilde\Pi}(-\mathrm{VaR})$ with linear interpolation for $p^h_{\Delta\tilde\Pi}$.
• Set the vector of components:

$$\frac{\partial \mathrm{VaR}}{\partial \Delta_i} = \frac{1}{p_{\Delta\tilde\Pi}(-\mathrm{VaR})}\,I.$$

• Repeat from step 2 for each $\Delta_i$ and for $\Lambda_{ii}$.
• Repeat for a grid with step size $2h$. Extrapolate both value-at-risk and the
gradients to get more accurate answers.
• Perform the change of coordinates using equations (4.100) and (4.101).

The computation boils down to finding derivatives of the pdf $p_{\Delta\tilde\Pi}$ with respect to $\Delta_i$ and
$\Lambda_{ii}$. In our implementation, we approximate the derivatives by differentiating the coefficients
of the discretized pdf $p^h$, i.e., by differentiating equations (4.89)-(4.91) with respect to $\Delta_i$
and $\Lambda_{ii}$. This procedure gives two discretized functions:

$$\frac{\partial p_i^h}{\partial \Delta_i} = h\sum_{j=0}^{N-1} \frac{\partial p_j^i}{\partial \Delta_i}\,\delta_j \quad\text{and}\quad \frac{\partial p_i^h}{\partial \Lambda_{ii}} = h\sum_{j=0}^{N-1} \frac{\partial p_j^i}{\partial \Lambda_{ii}}\,\delta_j.$$

Hence, we can compute discrete approximations to the partial derivatives with the DFT.
The algorithm is identical to the corresponding step, in equation (4.92), in the value-at-risk
algorithm. Finally, integrals (4.97) and (4.98) are approximated by summing the coefficients
and with linear interpolation over the final interval. The integration step is similar to
equation (4.95). The complete computational procedure is summarized in Algorithm 3.


Sensitivity Analysis and the Linear Approximation 


Consider a portfolio with two European call options, one on each of BCE and Canadian Tire. Both
options are at-the-money and mature in 3 months. Let $\theta_1$ and $\theta_2$ be the number of contracts
held for each of the two options. When priced using the Black-Scholes model, the value of
the portfolio is

$$\Pi = 3.23\,\theta_1 + 1.39\,\theta_2.$$

The $\Delta\Gamma$-approximation is given by

$$\Theta = -0.0277\,\theta_1 - 0.0119\,\theta_2,$$

$$\Delta = \begin{bmatrix} 24.3797\,\theta_1 \\ 10.4725\,\theta_2 \end{bmatrix},$$

$$\Gamma = \begin{bmatrix} 99.0967\,\theta_1 & 0 \\ 0 & 42.6138\,\theta_2 \end{bmatrix}.$$

As an example of a gradient calculation, we computed value-at-risk for a 25 x 25 grid of
$(\theta_1, \theta_2)$ portfolios. To understand the sensitivities of value-at-risk to the portfolio parameters,
we also computed the gradient at each grid point. Figure 4.20 shows the computed gradient
field superimposed over the level curves of value-at-risk. The vectors show the direction of
largest sensitivity to changes in the portfolio. The six graphs show the results for the 95%
and 99% value-at-risk and for the three return models introduced in Section 4.1. From the
figure we make three observations. First, for all plots the computed vector field agrees with
the level sets and shows that changes in $\theta_1$, the position in the BCE option, have the most
impact on the risk. Second, similar to the previous example, the asymmetric t and Parzen
models for return give a larger estimate of the risk as compared to the normal model. Third,
the level sets for the Parzen model are less smooth, but the computed gradients still seem to
agree quite well with the macro scale of change for the function. For value-at-risk simulations
the lack of smoothness of computed value-at-risk is a minor concern, but when applied to
optimization problems this is a serious shortcoming.

Consider portfolios where $\theta_2 = 1$ is fixed and $\theta_1$ varies, i.e., the portfolios on a horizontal
line in each of the graphs in Figure 4.20. For value-at-risk as a function of $\theta_1$, the derivative of
value-at-risk gives a linear approximation. Figure 4.21 shows the value-at-risk and the linear
approximation computed for $\theta_1 = -0.2$. In all six cases, the linear approximations accurately
describe the local dynamics of the value-at-risk. Neglecting to differentiate the market model
does not lead to an inaccurate derivative.

FIGURE 4.20 Level sets for value-at-risk and the gradient field for value-at-risk, for the 95% and
99% value-at-risk. The horizontal axis is $\theta_1$, the position in the BCE call option, and the vertical
axis is $\theta_2$, the position in the Canadian Tire option.

FIGURE 4.21 Linear approximations to the 95% and 99% value-at-risk as a function of $\theta_1$, the position
in the call option on BCE, with one panel per return model. The linear approximation is computed for
$\theta_1 = -0.2$.


Hedging with Value-at-Risk 


We conclude this section with an optimization example. Similar to the previous examples,
we consider a portfolio with a long position in one European call option each on BCE and
Canadian Tire. The two options both mature in 3 months and are at-the-money. The TSE35
is an index with 35 stocks trading on the Toronto Stock Exchange. Extend the model with
two more securities: the index itself and an at-the-money call option on the index. Since the
TSE35 includes both BCE and Canadian Tire and there is significant positive correlation
of the returns, it is possible to find a position that reduces the value-at-risk of the original
portfolio. Let $\theta_1$ and $\theta_2$ be the numbers of index units and of call options on the index in the
portfolio. The $\Delta\Gamma$-approximation for this portfolio is given by

$$\Theta = 0.0396 - 0.2354\,\theta_2,$$

$$\Delta = \begin{bmatrix} -24.3797 \\ -10.4725 \\ 564.75\,\theta_1 + 321.71\,\theta_2 \end{bmatrix},$$

$$\Gamma = \begin{bmatrix} 99.0967 & 0 & 0 \\ 0 & 42.6138 & 0 \\ 0 & 0 & 2233.2\,\theta_2 \end{bmatrix}.$$

This leads to the following optimization problem:

$$\min_{\theta_1, \theta_2} \mathrm{VaR}(\theta_1, \theta_2).$$


The value-at-risk surface has a flat fold; see Figure 4.22. The portfolios along the fold
all have approximately the same value-at-risk, which corresponds to portfolios where $\Delta_3$ is
constant. To study the performance of the fast convolution method when used in combination
with optimization software, we computed the minimum for each of the three return models
(see footnote 20). The computed solutions are marked in the plots in Figure 4.22. From studying
the surface and inspecting the iterations, we note that the computed solutions are different and
that the solutions for all three problems are very close to degenerate. Hence, hedging the portfolio
using both index and index options does not lead to any significant reduction in the value-at-risk
as compared to using the index alone.


4.6.8 Adding Stochastic Volatility


The picture changes when the model is extended by making volatility a risk factor. We chose
to consider a simple extension. Suppose that the volatility for all three risk factors is the
same; i.e., the changes in volatilities satisfy

$$\frac{\Delta\sigma^1}{\sigma^1} = \frac{\Delta\sigma^2}{\sigma^2} = \frac{\Delta\sigma^3}{\sigma^3},$$

and we use the Black-Scholes equation to extract a $\Delta\Gamma$-approximation. Although this model
is too simple to be of use in practice, it captures the main property of a stochastic volatility
model and introduces a risk factor that can only be hedged using the option. We therefore
believe that the qualitative properties of the example are correct, and for the purpose of
exploring value-at-risk optimization it is therefore an appropriate model problem.


20 We used the quasi-Newton method.

FIGURE 4.22 The 99% value-at-risk surface as a function of $\theta_1$, the position in TSE35 (axis label
“Stock”), and $\theta_2$, the position in call options on TSE35 (axis label “Option”). The left column of
panels is for constant volatility and the right column for stochastic volatility. The point marked is
the computed solution to the optimization problem. Volatility as a risk factor changes the shape of
the value-at-risk surface and makes the minimum well defined. Computed minima shown in the panels:

(i) Normal returns, constant volatility: $\theta_1 = 0.19$, $\theta_2 = -0.21$, VaR = 2.1379
(ii) Normal returns, stochastic volatility: $\theta_1 = -0.04$, $\theta_2 = 0.19$, VaR = 1.1096
(iii) Asymmetric t returns, constant volatility: $\theta_1 = 0.06$, $\theta_2 = 0.01$, VaR = 2.4475
(iv) Asymmetric t returns, stochastic volatility: $\theta_1 = -0.02$, $\theta_2 = 0.18$, VaR = 1.2660
(v) Parzen estimate for returns, constant volatility: $\theta_1 = -0.15$, $\theta_2 = 0.40$, VaR = 2.7398
(vi) Parzen estimate for returns, stochastic volatility: $\theta_1 = -0.00$, $\theta_2 = 0.15$, VaR = 1.4421

If we include first-order volatility risk, this gives a new $\Delta\Gamma$-approximation for the portfolio:

$$\Theta = 0.0396 - 0.2354\,\theta_2,$$

$$\Delta = \begin{bmatrix} -24.3797 \\ -10.4725 \\ 564.75\,\theta_1 + 321.71\,\theta_2 \\ -4.1963 + 22.0409\,\theta_2 \end{bmatrix},$$

$$\Gamma = \begin{bmatrix} -99.0967 & 0 & 0 & 0 \\ 0 & -42.6138 & 0 & 0 \\ 0 & 0 & 2233.2\,\theta_2 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$

In Figure 4.22, we see that the value-at-risk surfaces for the stochastic volatility model 
are different from the model without volatility risk. In the stochastic volatility model, the 
degenerate minimum is replaced by a well-defined minimum. Furthermore, the portfolio with 
optimal value-at-risk combines a position in the index and the index option to reduce the risk, 
and using only the index would give a less efficient hedge. Finally, the minima computed 
using optimization software are close to each other for all three return models. 

From the example we can draw several interesting conclusions. First, the example shows 
that value-at-risk leads to nontrivial optimization problems. The fast convolution method 
provides an efficient basis for solving hedging problems with value-at-risk as the risk measure. 
Second, the shape of the value-at-risk surface changes when volatility risk is included in the 
model. In the stochastic volatility model, options are important to reduce the value-at-risk. 
Third, our experience is that optimization with the Parzen model is less reliable than the 
other two return models. The reason is that the surface for the Parzen model has small-scale 
variations caused by the nonparametric density estimator. When it is used in an optimization 
algorithm, the lack of smoothness makes the use of finite-difference approximations to the 
derivatives less stable. To some extent this can be handled by varying the tolerances used in 
these computations, but it does not change the fact that the steps and stopping criteria are 
less reliable, and special care must be taken to check the accuracy of the computed solution. 


4.7 Risk-Factor Aggregation and Dimension Reduction 


For many portfolios, a few main directions determine the dominant behavior of the dynamics 
of the portfolio value. This is a combined effect of correlation of risk-factor returns and the 
quantities of each security held in the portfolio. Therefore, it is natural to search for a simpler 
approximation depending on fewer factors that is close to the original model. In this section, 
we develop two portfolio-dependent methods for dimension reduction (see also [AJW02]). 
This problem has been examined by other authors; e.g., Hull [Hul00] shows how to
use principal component analysis in an interest rate model; Kreinin et al. [KMRZ98] present
a principal-component-based method for linear portfolios with normally distributed risk
factors; and Reimers and Zerbs [RZ98] develop a reduction method where asset blocks
are represented by their dominant principal components and cross-block covariances by the
covariance for the largest principal component of each block. The methods presented here take
a $\Delta\Gamma$-approximation and compute a new approximation of lower dimensionality that is close
to the original function. The section concludes with two examples. In the first example, we
compare the performance of the methods for a sample portfolio. We conclude that the method
based on mean square error is more accurate, easier to implement, and slightly more efficient.
In the second example, we apply the dimension reduction method to an optimization problem.
This experiment shows that dimension reduction can be effective in reducing computation time
while maintaining good accuracy. Parts of these results have appeared in Albanese, Jackson,
and Wiberg [AJW02], but the numerical experiments presented here are more extensive and
the conclusions about nonnormal models are new.

A $\Delta\Gamma$-approximation can be written as

$$\Delta\tilde\Pi = \Xi + X^T\Delta' + \frac{1}{2}X^T\Lambda X = \Xi + \sum_{i=1}^{n}\left(\Delta'_i X_i + \frac{\lambda_i}{2}X_i^2\right). \tag{4.102}$$

The $n \times 1$ vector $X$ is an affine transformation of the original risk-factor returns (see
Section 4.3). The vector of risk-factor returns is a random variable, and the transformed
random vector $X$ satisfies

$$E[X] = 0 \quad\text{and}\quad E[XX^T] = I.$$
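
One way to obtain a vector with zero mean and identity second moment from sample returns is to center the data and apply the inverse of a factor $A$ of the sample covariance, $C = AA^T$. The sketch below uses a Cholesky factor in two dimensions; that choice, the made-up return series, and the function names are assumptions for illustration, and the transformation in Section 4.3 may differ.

```python
import math

def mean_and_covariance(returns):
    """Sample mean and (1/d)-normalized covariance of two-dimensional returns."""
    d = len(returns)
    mu = [sum(r[i] for r in returns) / d for i in range(2)]
    cov = [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in returns) / d
            for j in range(2)] for i in range(2)]
    return mu, cov

def whiten(returns):
    """Return X = A^{-1}(r - mu), where A is the lower Cholesky factor of the covariance."""
    mu, cov = mean_and_covariance(returns)
    l11 = math.sqrt(cov[0][0])
    l21 = cov[1][0] / l11
    l22 = math.sqrt(cov[1][1] - l21 * l21)
    transformed = []
    for r in returns:
        y0, y1 = r[0] - mu[0], r[1] - mu[1]
        x0 = y0 / l11                 # forward substitution
        x1 = (y1 - l21 * x0) / l22
        transformed.append((x0, x1))
    return transformed

# Hypothetical daily returns for two risk factors.
sample_returns = [(0.01, 0.02), (-0.02, 0.01), (0.015, -0.01), (-0.005, -0.02), (0.0, 0.005)]
x_white = whiten(sample_returns)
white_mean, white_cov = mean_and_covariance(x_white)
```

By construction the sample mean of the transformed data is zero and its sample second-moment matrix is the identity, up to rounding.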


If the number of risk factors $n$ is large, the time to compute value-at-risk is large. The
objective in dimension reduction is to find a $\Delta\Gamma$-approximation $\Delta\tilde\Pi_k$ that captures the main
properties of $\Delta\tilde\Pi$ but depends on only $k < n$ dimensions. Successful dimension reduction
reduces the time to compute value-at-risk. This is particularly important for problems
where value-at-risk must be computed repeatedly, as, for example, in solving optimization
problems.

The strategy proposed here is to restrict $\Delta\tilde\Pi$ to a subspace such that $\Delta\tilde\Pi$ and the restriction
$\Delta\tilde\Pi_k$ are close. Let $P$ be a projection (see footnote 21) onto the subspace spanned by the
orthonormal columns of an $n \times k$ matrix $Q_1$, and let $P^\perp$ be the projection onto the complementary
subspace spanned by the orthonormal columns of $Q_2$. Let $z = (z_1, z_2)$ be defined by


$$X = PX + P^\perp X = Q_1 z_1 + Q_2 z_2. \tag{4.103}$$


Based on this factorization of the risk-factor space, we conclude that the reduced approximation
has the form


$$\Delta\tilde\Pi(X) \approx \Delta\tilde\Pi(PX) = \Delta\tilde\Pi_k(z_1). \tag{4.104}$$


In the sections that follow, we present two methods for finding a projection so that $\Delta\tilde\Pi$
and $\Delta\tilde\Pi_k$ are close. The methods are motivated by two different views of the meaning of
closeness. The first method solves the problem by finding a lower-dimension approximation
with a small mean square error. The second method uses the identification of quadratic forms
with matrices and solves the problem, after rescaling the variables, by finding a lower-rank
matrix close to the original $\Delta\Gamma$-approximation.

4.7.1 Method 1: Reduction with Small Mean Square Error

In Method 1, we find $\Delta\tilde\Pi_k$ in equation (4.104) by insisting that the mean square error
$E[(\Delta\tilde\Pi - \Delta\tilde\Pi_k)^2]$ be small. To motivate the algorithm for dimension reduction, we need the
following lemma.


21 A projection is a Hermitian matrix $P$ such that $P^2 = P$.

Lemma 4.3. Let $A$ be an $n \times n$ matrix with $x$, $a$, $b$ as $n \times 1$ vectors. Suppose that

$$\max_{|a|_2 = 1}\ \max_{|b|_2 = 1} E[(a^T x)^2 (b^T x)^2] \le \beta.$$

Then

$$E[(x^T A x)^2] \le n\beta |A|_F^2,$$

where $|\cdot|_F$ denotes the Frobenius norm of a matrix; i.e., $|A|_F = \left(\sum_{i,j=1}^{n} |A_{ij}|^2\right)^{1/2}$.

Proof. From the Cauchy-Schwarz inequality, it follows that

$$E[(x^T A x)^2] \le E[(x^T x)(x^T A^T A x)].$$

Since $A^T A$ is symmetric, there is an orthogonal matrix $Q$ such that $A^T A = Q^T \Sigma^2 Q$, where $\Sigma$
is diagonal. Hence,

$$E[(x^T x)(x^T A^T A x)] = E[(x^T Q^T Q x)(x^T Q^T \Sigma^2 Q x)].$$

Define $y_i = q_i^T x$, where $q_i^T$ is the $i$th row vector of $Q$. By assumption, we have $E[y_i^2 y_j^2] \le \beta$
for all $i$ and $j$, and it follows that

$$E[(y^T y)(y^T \Sigma^2 y)] = E\left[\left(\sum_k y_k^2\right)\left(\sum_l \sigma_l^2 y_l^2\right)\right] = \sum_l \sigma_l^2 \sum_k E[y_k^2 y_l^2] \le n\beta \sum_l \sigma_l^2.$$

Since $\sum_l \sigma_l^2 = |A|_F^2$, the lemma follows. $\square$


Suppose that the 4th-order moments in the lemma are bounded and that we have an
estimate $\beta$ for this bound. Consider a tolerance $\epsilon > 0$. By reordering the components of $X$,
we can order the eigenvalues, the diagonal elements of $\Lambda$, such that $|\lambda_i| \ge |\lambda_{i+1}|$ for all $i$. Let
$k$ be the smallest index for which


yd Se (4.105) 


i=k+1 


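Partition $X = (X_1, X_2)$, with $X_1$ containing the first $k$ components. We can then write
equation (4.102) as

$$\Delta\tilde\Pi = \Xi + \begin{bmatrix} X_1^T & X_2^T \end{bmatrix}\begin{bmatrix} \Delta'_1 \\ \Delta'_2 \end{bmatrix} + \frac{1}{2}\begin{bmatrix} X_1^T & X_2^T \end{bmatrix}\begin{bmatrix} \Lambda_1 & 0 \\ 0 & \Lambda_2 \end{bmatrix}\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}.$$

The contribution to the mean square error from $\Lambda_2$ is small relative to $\epsilon$, but the effect on the
gain from $\Delta'_2$ may still be large. With a simple trick we can keep all the information in $\Delta'_2$.
Let $V = [v_1, V_2]$ be an orthogonal matrix with the first column $v_1 = \Delta'_2/|\Delta'_2|_2$. Since $\Delta'_2$ is
orthogonal to the column vectors in $V_2$, we define the $(k + 1)$-vector $z_1$ and the $(n - k - 1)$-vector
$z_2$ in equation (4.103) by

$$z_1 = \begin{bmatrix} X_1 \\ v_1^T X_2 \end{bmatrix} \quad\text{and}\quad z_2 = V_2^T X_2.$$

Hence, the reduced $\Delta\Gamma$-approximation is

$$\Delta\tilde\Pi_k = \Xi + z_1^T\begin{bmatrix} \Delta'_1 \\ |\Delta'_2|_2 \end{bmatrix} + \frac{1}{2} z_1^T\begin{bmatrix} \Lambda_1 & 0 \\ 0 & v_1^T \Lambda_2 v_1 \end{bmatrix} z_1.$$

This dimension reduction method is summarized in Algorithm 4.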
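To relate the reduced $\Delta\Gamma$-approximation to the mean square error, we observe that the
residual is a pure quadratic term:

\[ \Delta\tilde\Pi - \Delta\tilde\Pi_k = \frac{1}{2}
   \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}^T
   \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & v_1^T \Lambda_2 V_2 \\ 0 & V_2^T \Lambda_2 v_1 & V_2^T \Lambda_2 V_2 \end{bmatrix}
   \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}, \]

where the first block row and column correspond to $X_1$. It is easy to show that

\[ \left| \begin{bmatrix} 0 & v_1^T \Lambda_2 V_2 \\ V_2^T \Lambda_2 v_1 & V_2^T \Lambda_2 V_2 \end{bmatrix} \right|_F
   \le |V^T \Lambda_2 V|_F. \]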


By the unitary invariance of the Frobenius norm and criterion (4.105), we can apply the 
lemma to prove the bound for the mean square error summarized in the following theorem. 





Algorithm 4
Dimension Reduction, Method 1

Input: A $\Delta\Gamma$-approximation. A time series $\{r_i\}_{i=1}^{d}$ of daily returns. A tolerance
$\epsilon > 0$.
Output: Parameter estimates for the market model. A factorization of the reduced
$\Delta\Gamma$-approximation.

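• Compute estimates for the mean $\mu$ and the covariance matrix $C$.
• Solve the eigenvalue problem

\[ \Lambda = A^T \Gamma A, \qquad C = A A^T. \]

Order the eigenvalues so that $|\lambda_i| \ge |\lambda_{i+1}|$.
• Compute

\[ \tilde\Theta = \Theta t + \mu^T \Delta + \tfrac{1}{2} \mu^T \Gamma \mu, \qquad
   \tilde\Delta = A^T(\Delta + \Gamma\mu). \]

• Find the smallest integer $k$ such that

\[ \sum_{i=k+1}^{n} \lambda_i^2 \le \epsilon. \]

• Compute

\[ \Delta_k = \big[\tilde\Delta(1:k),\ |\tilde\Delta(k+1:n)|_2\big], \qquad
   v_1 = \tilde\Delta(k+1:n) / |\tilde\Delta(k+1:n)|_2, \]

\[ \Lambda_k = \begin{bmatrix} \Lambda(1:k, 1:k) & 0 \\ 0 & v_1^T \Lambda(k+1:n, k+1:n)\, v_1 \end{bmatrix}. \]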


e For each Z, estimate the remaining parameters from the transformed returns 
{Al (r;— p)}2, and {vi A~ (r; — #)}4,, using the methods in Section 4.1. 
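The truncation and "Compute" steps of Algorithm 4 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the truncation criterion is taken to be the tail sum of squared eigenvalues, the inputs are assumed to be already expressed in the decorrelated coordinates $X$ with eigenvalues sorted by decreasing absolute value, and the name `reduce_method1` is hypothetical.

```python
import math


def reduce_method1(delta, lam, eps):
    """Sketch of the truncation and 'Compute' steps of Method 1.

    delta : linear coefficients in the X coordinates (list of floats)
    lam   : eigenvalues, ordered so that |lam[i]| >= |lam[i+1]|
    eps   : tolerance; ASSUMED criterion: the sum of lam[i]**2 over the
            discarded indices must not exceed eps
    Returns (k, delta_k, lam_k, v1).
    """
    n = len(lam)
    # Smallest k whose discarded tail satisfies the (assumed) criterion.
    k, tail = n, 0.0
    while k > 0 and tail + lam[k - 1] ** 2 <= eps:
        tail += lam[k - 1] ** 2
        k -= 1
    rest = delta[k:]
    norm = math.sqrt(sum(d * d for d in rest))
    if norm == 0.0:
        # No residual linear term, so no extra risk factor is needed.
        return k, list(delta[:k]), list(lam[:k]), []
    v1 = [d / norm for d in rest]
    # Curvature along v1: v1^T diag(lam[k:]) v1.
    curv = sum(v * v * l for v, l in zip(v1, lam[k:]))
    return k, list(delta[:k]) + [norm], list(lam[:k]) + [curv], v1


delta = [1.0, 0.5, 0.3, 0.4]
lam = [3.0, 1.0, 0.2, 0.1]
k, delta_k, lam_k, v1 = reduce_method1(delta, lam, eps=0.1)

# The linear term is preserved exactly: delta_k . z1 equals delta . x.
x = [0.7, -1.2, 0.4, 2.0]
z1 = x[:k] + [sum(v * xi for v, xi in zip(v1, x[k:]))]
linear_full = sum(d * xi for d, xi in zip(delta, x))
linear_reduced = sum(d * zi for d, zi in zip(delta_k, z1))
```

The last lines illustrate the point of the rotation: the extra factor $v_1^T X_2$ carries the whole discarded linear term, so only quadratic information is lost.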








Theorem 4.3. Let $\epsilon > 0$ and $\beta > 0$, with $x$, $a$, $b$ as $n \times 1$ vectors such that

\[ \max_{|a|_2=1}\ \max_{|b|_2=1} E[(a^T x)^2 (b^T x)^2] \le \beta. \]

Then the $\Delta\Gamma$-approximation $\Delta\tilde\Pi_k$ produced by Algorithm 4 satisfies


zañ- añ] <P 


Method 2: Reduction by Low-Rank Approximation 


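The random variables $X_i$ in equation (4.102) have zero mean, are uncorrelated, and have
variance 1. Therefore, the impact of each random variable is approximately the same. The
$\Delta\Gamma$-approximation can be written as a quadratic form:²²

\[ \Delta\tilde\Pi = \tilde\Theta + \frac{1}{2}
   \begin{bmatrix} X \\ 1 \end{bmatrix}^T
   \begin{bmatrix} \Lambda & \tilde\Delta \\ \tilde\Delta^T & 0 \end{bmatrix}
   \begin{bmatrix} X \\ 1 \end{bmatrix}. \]

Therefore, we can define the distance between two $\Delta\Gamma$-approximations, with the same constant
term $\tilde\Theta$, as

\[ |\Delta\tilde\Pi - \Delta\tilde\Pi_k| =
   \left| \begin{bmatrix} \Lambda & \tilde\Delta \\ \tilde\Delta^T & 0 \end{bmatrix}
        - \begin{bmatrix} \Lambda_k & \tilde\Delta_k \\ \tilde\Delta_k^T & 0 \end{bmatrix} \right|_2. \tag{4.106} \]

In Method 2, this is the definition of closeness we use for dimension reduction.

22 In practice, it is not necessary to transform the problem all the way to form (4.102). It is sufficient to compute
the Cholesky factor of the covariance matrix and transform the $\Delta\Gamma$-approximation accordingly. This way the number
of eigenvalue problems solved in upcoming Algorithm 5 can be reduced from three to two.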
Consider the Schur decomposition

\[ \begin{bmatrix} \Lambda & \tilde\Delta \\ \tilde\Delta^T & 0 \end{bmatrix} = U \Sigma U^T, \]

where $U$ is an orthogonal matrix and $\Sigma$ is a diagonal matrix. The diagonal elements of $\Sigma$
are the eigenvalues, and we may assume that they are ordered in decreasing absolute value
$|\sigma_1| \ge \dots \ge |\sigma_n|$. Let the $n \times n$ diagonal matrix $\Sigma_k$ be defined by

\[ \Sigma_k = \mathrm{diag}(\sigma_1, \dots, \sigma_k, 0, \dots, 0). \]

The Schmidt–Mirsky theorem says that $U \Sigma_k U^T$ solves the minimization problem

\[ \min_{\mathrm{rank}(B)=k} |\Delta\tilde\Pi - B|_2 = |\Delta\tilde\Pi - U \Sigma_k U^T|_2 = |\sigma_{k+1}| \]

(for a proof, see, for example, [SS90 p. 208 or GL89 p. 71]).
Given a tolerance $\epsilon > 0$, the Schmidt–Mirsky theorem gives a tool to find the optimal
function $\Delta\tilde\Pi_k$. Let $k$ be the smallest $k$ such that

\[ |\sigma_{k+1}| \le \epsilon. \]

Then

\[ |\Delta\tilde\Pi - U \Sigma_k U^T|_2 \le \epsilon. \]
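To make the Schmidt–Mirsky statement concrete, the sketch below checks it on the smallest possible case: a symmetric $2 \times 2$ matrix of the form $[[\lambda, \delta], [\delta, 0]]$, truncated to rank 1. The eigen-decomposition uses a closed-form Jacobi rotation so that only the standard library is needed; the helper names `sym2_eig` and `rank1_error` are hypothetical, and the numbers are made up for illustration.

```python
import math


def sym2_eig(a, b, c):
    """Eigen-decomposition of the symmetric 2x2 matrix [[a, b], [b, c]].

    Returns [(value, unit_vector), ...] sorted by decreasing |value|.
    """
    theta = 0.5 * math.atan2(2.0 * b, a - c)   # Jacobi rotation angle
    cs, sn = math.cos(theta), math.sin(theta)
    pairs = [
        (a * cs * cs + 2.0 * b * cs * sn + c * sn * sn, (cs, sn)),
        (a * sn * sn - 2.0 * b * cs * sn + c * cs * cs, (-sn, cs)),
    ]
    return sorted(pairs, key=lambda p: -abs(p[0]))


def rank1_error(a, b, c):
    """Spectral-norm error of the best rank-1 approximation, and |sigma_2|."""
    (s1, u1), (s2, _) = sym2_eig(a, b, c)
    # Rank-1 truncation B = s1 * u1 u1^T.
    b11, b12, b22 = s1 * u1[0] ** 2, s1 * u1[0] * u1[1], s1 * u1[1] ** 2
    # Spectral norm of the symmetric residual M - B: its largest |eigenvalue|.
    resid = sym2_eig(a - b11, b - b12, c - b22)
    return abs(resid[0][0]), abs(s2)


# lambda = 1, delta = 2: eigenvalues (1 +/- sqrt(17)) / 2, one of each sign.
err, sigma2 = rank1_error(1.0, 2.0, 0.0)
```

In this example the truncation error equals $|\sigma_2|$, as the theorem predicts. Note also that the bordered matrix has one positive and one negative eigenvalue whenever $\delta \ne 0$.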


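The terms of the reduced function $\Delta\tilde\Pi_k$ can be computed from the matrix $U \Sigma_k U^T$. Partition
the orthogonal matrix $U$ as

\[ U = \left[\begin{array}{ccc|ccc}
   u_{1,1} & \cdots & u_{1,k} & u_{1,k+1} & \cdots & u_{1,n+1} \\
   \vdots & & \vdots & \vdots & & \vdots \\
   u_{n,1} & \cdots & u_{n,k} & u_{n,k+1} & \cdots & u_{n,n+1} \\ \hline
   u_{n+1,1} & \cdots & u_{n+1,k} & u_{n+1,k+1} & \cdots & u_{n+1,n+1}
   \end{array}\right]
   = \begin{bmatrix} U_{11} & U_{12} \\ u_{21}^T & u_{22}^T \end{bmatrix}. \]

The matrix $Q_1$ in equation (4.103) can be taken as the $n \times k$ matrix with orthonormal columns
in the QR factorization $Q_1 R = U_{11}$. For this choice, we get the reduced $\Delta\Gamma$-approximation

\[ \Delta\tilde\Pi_k = \tilde\Theta + \tfrac{1}{2}\, u_{21}^T \Sigma_k u_{21} + z_1^T R \Sigma_k u_{21}
   + \tfrac{1}{2}\, z_1^T R \Sigma_k R^T z_1, \]

which can be factorized again. The steps of Method 2 are summarized in Algorithm 5. In
some special cases, the matrix $U_{11}$ may be rank deficient. This is not a serious problem, since
it is easy to show that this leads to an approximation of size $(k-1) \times (k-1)$; it is a lucky
break that earns us the additional reduction of one dimension.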





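Algorithm 5
Dimension Reduction, Method 2

Input: A $\Delta\Gamma$-approximation. A time series $\{r_i\}_{i=1}^{d}$ of daily returns. A tolerance
$\epsilon > 0$.
Output: Parameter estimates for the market model. A factorization of the reduced
$\Delta\Gamma$-approximation.

• Compute estimates for the mean $\mu$ and the covariance matrix $C$.
• Solve the eigenvalue problem

\[ \Lambda = A^T \Gamma A, \qquad C = A A^T. \]

• Compute

\[ \tilde\Theta = \Theta t + \mu^T \Delta + \tfrac{1}{2} \mu^T \Gamma \mu, \qquad
   \tilde\Delta = A^T(\Delta + \Gamma\mu). \]

• Compute the Schur decomposition

\[ \begin{bmatrix} \Lambda & \tilde\Delta \\ \tilde\Delta^T & 0 \end{bmatrix} = U \Sigma U^T. \]

Order the eigenvalues so that $|\sigma_i| \ge |\sigma_{i+1}|$.
• Find the smallest $k$ such that

\[ |\sigma_{k+1}| \le \epsilon. \]

• Compute the QR factorization

\[ Q R = U_{11}. \]

• Compute

\[ \Gamma_k = R \Sigma_k R^T, \qquad
   \Delta_k = R \Sigma_k\, U(n+1, 1:k)^T, \qquad
   \tilde\Theta_k = \tilde\Theta + \tfrac{1}{2}\, u_{21}^T \Sigma_k u_{21}. \]

• Factorize $\Delta\tilde\Pi_k$ and estimate the remaining parameters from the transformed
returns, using the methods in Section 4.1.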








4.7.3 Absolute versus Relative Value-at-Risk 


Instead of computing the value-at-risk directly from the value-at-risk equation (4.1) with $\Delta\Pi$
replaced by $\Delta\tilde\Pi_k$, we have found it is more accurate to compute the relative value-at-risk,
defined by

\[ \mathrm{VaR}_{rel}(\Delta\Pi) = \mathrm{VaR}(\Delta\Pi) - E[\Delta\Pi]. \]

Instead of computing the value-at-risk from the reduced approximation as

\[ \mathrm{VaR}(\Delta\Pi) \approx \mathrm{VaR}(\Delta\tilde\Pi_k), \tag{4.107} \]

we use the formula

\[ \mathrm{VaR}(\Delta\Pi) \approx \mathrm{VaR}_{rel}(\Delta\tilde\Pi_k) + E[\Delta\Pi]. \tag{4.108} \]

The motivation for this choice becomes clearer when we consider the errors in the
two formulas. The error for equation (4.107) is

\[ \mathrm{VaR}(\Delta\Pi) - \mathrm{VaR}(\Delta\tilde\Pi_k)
   = \mathrm{VaR}_{rel}(\Delta\Pi) - \mathrm{VaR}_{rel}(\Delta\tilde\Pi_k) + E[\Delta\Pi - \Delta\tilde\Pi_k], \tag{4.109} \]

and for equation (4.108) it is

\[ \mathrm{VaR}(\Delta\Pi) - \mathrm{VaR}_{rel}(\Delta\tilde\Pi_k) - E[\Delta\Pi]
   = \mathrm{VaR}_{rel}(\Delta\Pi) - \mathrm{VaR}_{rel}(\Delta\tilde\Pi_k). \]

The extra term $E[\Delta\Pi - \Delta\tilde\Pi_k]$ in equation (4.109) can be large, while the difference between
the relative value-at-risk terms is still small. Since computing the expectation for the full
portfolio is easy, separating the terms and using the reduction $\Delta\tilde\Pi_k$ only to compute the
relative value-at-risk is both easy to do and leads to better accuracy.

The ideas used to develop the two methods for dimension reduction are very different.
Method 1 has a direct connection to probability theory; it finds a reduced model with small mean
square error. Method 2 is based on a linear algebra argument, computing a reduced model using a
low-rank approximation of the matrix for the quadratic form. Method 1 has several advantages
over Method 2. First, Method 1 is easier to implement and slightly more efficient. Second, in
Method 1 the structure of the $\Delta\Gamma$-approximation is preserved in the new quadratic form, except
for the final risk factor, which captures the residual linear term. Finally, the numerical experiment
in the next section shows that Method 1 gives a more accurate reduced model in practice.


4.7.4 Example: A Comparative Experiment


In this section, we present the first of two computational examples. The purpose of the first 
example is to study the performance of dimension reduction and to compare the results from 
the two methods. 

The portfolios in our previous examples have few dimensions. The advantage of dimension 
reduction is to reduce computation time for portfolios with many risk factors. So we consider 
a portfolio with options on each of the stocks in the TSE35 index. The returns for the 35 stocks 
have significant correlation, and we expect that dimension reduction will produce accurate 
simulation results for relatively few dimensions. The portfolio consists of short positions with 
one call option and one put option on each of the stocks. As before, the options are European 
and at-the-money and have 3 months to maturity. Furthermore, we include one independent 
risk factor for changes in volatility, which is shared by all options; see Section 4.6.8 for 
details. Experiments with a similar problem, without volatility risk factor and with normal 
returns, are reported. 

To study the effect of dimension reduction, the daily value-at-risk is computed with 
dimensions k = 1,...,36 (see Figure 4.23). Each graph shows the 99% value-at-risk and 
95% value-at-risk. There is one graph for each of Method 1 and Method 2 applied to the 
three risk factor models. Our expectation, that only a few dimensions essentially characterize 
the risk of such a portfolio, is confirmed by the simulations. 

Figure 4.24 shows the relative errors for the simulations in Figure 4.23. The figures lead to 
some interesting observations. Dimension reduction produces results that resemble the result 
for the full model, but the quality of the result differs for Method 1 and Method 2. In general, 
the result for the 95% value-at-risk seems to be better than for the 99% value-at-risk. In the 
case of normal returns, the error is very small for both methods [see graphs (i) and (ii)]. The 
error for Method 1 is small for both the asymmetric t and the Parzen models [see graphs
(iii) and (v)]. Method 2 does not produce accurate results for the last two return models
[see graphs (iv) and (vi)], and, more seriously, it is not clear that the results improve when
more dimensions are included.

FIGURE 4.23 Value-at-risk with dimension reduction. The graphs show the 95% and 99% value-at-risk
as a function of the number of dimensions used in the simulation. The x-axis spacings are marked
for every five dimensions.

To try to understand the failure of Method 2, we examined the intermediate results
produced during the execution. We believe that the problem arises from a large $\Delta$ component
that is split over two dimensions in the reduced model, one with positive and one with
negative curvature. In the normal model, where the model is characterized by the mean and
variance of the returns, this does not lead to a deterioration of the simulation result. In the
other two models, the estimation error destroys the balance between the two components, and
this corrupts the result. This is a serious drawback for Method 2 and one that is not shared
by Method 1.

FIGURE 4.24 Relative errors for the simulations in Figure 4.23. The solid lines correspond to 99%
VaR. Observe that the scales for graphs (iii) and (iv) are different.

To conclude, the value-at-risk computed with Method 1 has a small relative error. Although
the error does not decrease monotonically, the trend is clear: more dimensions give more
accurate results. As demonstrated by this example, Method 1 produces better results than Method 2.


4.7.5 Example: Dimension Reduction and Optimization


We conclude with an optimization example similar to the one in Section 4.6.7. The example 
shows that dimension reduction leads to significant savings in computation time and that 
the accuracy is preserved despite the reduction. Since Method 1 has clear advantages over 
Method 2, we restrict our attention to the first method. 

Consider a portfolio with a short position in one call option on each of the stocks in the 
TSE35 index. All options are European and at-the-money and have 3 months to maturity. 
Suppose we want to minimize the value-at-risk by buying or selling a combination of a linear
position in the index, such as the index itself or a future, and a position in a call option
on the index. Let $\theta_1$ and $\theta_2$ be, respectively, the number of index units and call options in
the portfolio. To see how dimension reduction affects the shape of the value-at-risk surface,
we computed the 99% value-at-risk for the full portfolio and for a reduced model with five
dimensions. As Figure 4.25 shows, the differences between the surfaces are small. We are
therefore led to believe that the same applies to the optimization problem

\[ \min_{\theta_1, \theta_2} \mathrm{VaR}(\theta_1, \theta_2). \]

Table 4.1 shows the computed solutions to the optimization problem and statistics about
time and number of value-at-risk computations required by the numerical procedure.²³


[Figure 4.25: six surface plots in two columns, "Reduced to Five Dimensions" (left) and "Full Model"
(right), with horizontal axes "Stock, $\theta_1$" and "Option, $\theta_2$". Panels (i) and (ii): normal returns;
panels (iii) and (iv): asymmetric t returns; panels (v) and (vi): Parzen estimate for returns.]

FIGURE 4.25 The 99% value-at-risk surfaces, as functions of $\theta_1$ and $\theta_2$, computed with the number
of dimensions reduced to five and for the full model.


23 The results were created using the quasi-Newton method fminunc in Matlab’s optimization toolbox. 
TABLE 4.1 Computed Solutions to the Optimization Problem (Quoted Times Are in Seconds of CPU Time)

# Dimensions    θ1      θ2      VaR      Time     Function calls   Time/function call
     1        −1.89    5.31    11.93     8.70         101              0.0861
     2        −1.87    5.30    12.15    12.87         129              0.0998
     3        −1.87    5.31    12.16    14.79         129              0.1147
     4        −1.88    5.32    12.16    15.53         122              0.1273
     5        −1.87    5.31    12.16    18.54         130              0.1426
     6        −1.87    5.31    12.16    13.41          84              0.1596
     8        −1.87    5.29    12.16    24.50         127              0.1929
    10        −1.86    5.27    12.17    29.60         131              0.2260
    12        −1.89    5.34    12.16    30.40         118              0.2576
    17        −1.86    5.27    12.16    33.06         102              0.3241
    22        −1.86    5.29    12.17    31.18          79              0.3947
    27        −1.92    5.28    12.16    38.70          83              0.4663
    32        −1.82    5.19    12.21    50.85          95              0.5353
    37        −1.86    5.27    12.17    77.33         127              0.6089

(i) Normal returns

# Dimensions    θ1      θ2      VaR      Time     Function calls   Time/function call
     1        −1.91    5.46    13.15    18.83          85              0.2215
     2        −1.91    5.43    13.30    29.10         102              0.2853
     3        −1.91    5.40    13.55    36.36         100              0.3636
     4        −1.82    5.26    13.61    55.51         128              0.4337
     5        −1.80    5.23    13.62    41.45          80              0.5181
     6        −1.87    5.30    13.56    65.75         111              0.5923
     8        −1.81    5.28    13.59    29.70          39              0.7615
    10        −1.81    5.23    13.58   114.8          127              0.9038
    12        −1.83    5.31    13.57   146.5          142              1.0316
    17        −1.86    5.28    13.57   115.3           83              1.3889
    22        −1.80    5.22    13.60   151.0           87              1.7356
    27        −1.81    5.23    13.62   167.2           80              2.0905
    32        −1.80    5.22    13.60   316.5          129              2.4537
    37        −1.84    5.28    13.60   252.9           89              2.8412

(ii) Asymmetric t returns

# Dimensions    θ1      θ2      VaR      Time     Function calls   Time/function call
     1        −2.05    5.26    14.21    33.50         128              0.2617
     2        −1.85    5.27    14.51    50.69         110              0.4608
     3        −1.86    5.32    14.70    90.81         140              0.6486
     4        −1.82    5.18    13.01    69.43          82              0.8467
     5        −1.91    5.27    13.03    73.63          70              1.0519
     6        −2.00    5.00    13.32   153.0          122              1.2543
     8        −1.88    4.96    13.28   153.4           93              1.6495
    10        −1.86    5.16    12.98   108.9           53              2.0555
    12        −1.86    5.15    13.00   365.4          150              2.4357
    17        −1.83    5.16    13.11   210.1           61              3.4434
    22        −1.64    4.98    13.34   725.2          165              4.3950
    27        −1.77    5.10    13.22   749.9          139              5.3947
    32        −1.85    5.18    12.89   562.6           88              6.3928
    37        −1.89    5.27    12.90   601.8           81              7.4291

(iii) Parzen estimate for returns


The variations in value-at-risk at the computed minima are small for the normal and asym- 
metric t models, and the changes in the location of the minima are relatively small. Similar to 
the observations in Section 4.6.7, we note that the performance of the minimization procedure 
is less reliable for the Parzen model; the density estimator introduces small fluctuations in 
the value-at-risk. In all three cases, we see that dimension reduction is very effective in 
reducing the computation time per function evaluation. The total time for the optimization is 
also reduced. Although the total time is important, it is not a good indicator, since it mostly 
depends on the success of the stopping criteria used by the optimization algorithm. 


4.8 Perturbation Theory 


The experiments in previous sections show that the different models for risk-factor returns
can lead to large differences in the estimate for value-at-risk. Similar observations have been
made by several other authors; see Beder [Bed95] and Jorion [Jor96]. In this section, we
derive a perturbation result that describes how value-at-risk changes with perturbations to
the model for risk-factor returns. It shows that value-at-risk becomes increasingly sensitive
as the confidence level increases and that the tail is more sensitive than the center of the
distribution. We present a computable condition number for value-at-risk and illustrate the
theory with a numerical example.


4.8.1 When Is Value-at-Risk Well Posed?


Value-at-risk is defined as the solution to nonlinear equation (4.1). If $p(r)$ is the pdf for the
risk-factor returns, then equation (4.1) is equivalent to the integral equation (i.e., equation (4.72)):

\[ \int_{\{r \in \mathbb{R}^n :\ \Delta\Pi(r) \le -\mathrm{VaR}\}} p(r)\, dr = 1 - \alpha. \tag{4.110} \]

The integration domain is a subset of risk-factor returns in $\mathbb{R}^n$. The coarea formula²⁴ gives

\[ \int_{-\infty}^{-\mathrm{VaR}} p_{\Delta\Pi}(\rho)\, d\rho = 1 - \alpha, \tag{4.111} \]

where

\[ p_{\Delta\Pi}(\rho) = \int_{\{\Delta\Pi(r) = \rho\}} \frac{p(r)}{|D\Delta\Pi(r)|}\, dA \tag{4.112} \]

and

\[ |D\Delta\Pi(r)| = \Big( \sum_{i=1}^{n} \Big( \frac{\partial \Delta\Pi}{\partial r_i} \Big)^2 \Big)^{1/2}. \]

So the coarea formula transforms integral (4.110) into one-dimensional integral (4.111) over
a new pdf (4.112). In other words, equation (4.111) is an integral over the pdf for $\Delta\Pi$, which
is defined by equation (4.112) as the $(n-1)$-dimensional surface integral (i.e., $dA$ is the
surface differential) over the level sets of $\Delta\Pi$. Equations (4.110) and (4.111) are equivalent,
but we find the second form more convenient in our perturbation analysis.

Hadamard’s classic definition says that a problem is well posed if it has a unique solution
that depends continuously on the initial data. The properties of pdf (4.112) determine if
value-at-risk is a well-posed problem. The first condition, existence of a solution, holds without
additional assumptions. The cdf

\[ \Phi(x) = P(\Delta\Pi \le x) = \int_{-\infty}^{x} p_{\Delta\Pi}(\rho)\, d\rho \tag{4.113} \]

has range (0, 1). Hence, since the cdf $\Phi$ is a continuous function, the intermediate value
theorem implies that equation (4.111) has a solution for all $\alpha \in (0, 1)$.
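The existence argument is constructive: because the cdf is continuous and nondecreasing, the solution of equation (4.111) can be located by bisection. The following minimal sketch does this for a hypothetical one-factor example in which $\Delta\Pi$ is normal; the distribution parameters, the bracket, and the name `var_by_bisection` are illustrative assumptions.

```python
from statistics import NormalDist


def var_by_bisection(cdf, alpha, lo, hi, tol=1e-12):
    """Solve cdf(-VaR) = 1 - alpha for VaR by bisection on x = -VaR.

    cdf must be continuous and nondecreasing, with
    cdf(lo) < 1 - alpha < cdf(hi).
    """
    target = 1.0 - alpha
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return -0.5 * (lo + hi)


# Hypothetical example: Delta-Pi ~ N(0.001, 0.02^2).
dist = NormalDist(mu=0.001, sigma=0.02)
var99 = var_by_bisection(dist.cdf, 0.99, -1.0, 1.0)
closed_form = -dist.inv_cdf(0.01)   # the same quantile, computed directly
```

For a normal model the bisection result agrees with the closed-form quantile. The same routine applies to any continuous cdf, which is the situation the intermediate value theorem covers.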

For uniqueness and continuity to hold, the cdf in equation (4.113) must be strictly increasing in a
neighborhood around the solution $x = -\mathrm{VaR}$. Equivalently, the solution is unique if the density
in equation (4.112) is positive almost everywhere in a neighborhood of $-\mathrm{VaR}$. Uniqueness follows
by observing that, if there were two solutions with $\mathrm{VaR}_1 > \mathrm{VaR}_2$, then since the cdf (4.113) is
an increasing function, any $x$ with $-\mathrm{VaR}_1 < x < -\mathrm{VaR}_2$ would be a solution too. Hence,
the cdf (4.113) would not be strictly increasing, or, equivalently, the pdf would not be positive
almost everywhere, in any neighborhood of a solution.

Continuity, the third condition for equation (4.111) to be well posed, requires that the
solution depend continuously on the data. We show that value-at-risk is continuous for changes
in the pdf $p(r)$. Suppose that the density in equation (4.112) is positive almost everywhere
in some interval $(-\mathrm{VaR} - \epsilon, -\mathrm{VaR} + \epsilon)$. Let $\{p_i\}_{i=1}^{\infty}$ be a sequence of pdfs that converges to
$p$; that is, $|p - p_i|_1 \to 0$ as $i \to \infty$. Moreover, let $\mathrm{VaR}_i$ be the solutions to equation (4.110)
corresponding to $p_i$ and some fixed $\alpha$. Combining equations (4.111) and (4.112) gives

\[ \int_{-\infty}^{-\mathrm{VaR}} d\rho \int_{\{\Delta\Pi(r) = \rho\}} \frac{p(r)}{|D\Delta\Pi(r)|}\, dA
   = \int_{-\infty}^{-\mathrm{VaR}_i} d\rho \int_{\{\Delta\Pi(r) = \rho\}} \frac{p_i(r)}{|D\Delta\Pi(r)|}\, dA. \tag{4.114} \]

24 We refer to Evans and Gariepy [EG92] for a proof and discussion of the coarea formula. The formula can be
applied assuming that $\Delta\Pi$ is Lipschitz differentiable and ess inf $|D\Delta\Pi| > 0$.

We then obtain

\[ \left| \int_{-\mathrm{VaR}_i}^{-\mathrm{VaR}} p_{\Delta\Pi}(\rho)\, d\rho \right|
   \le \int_{\{\Delta\Pi(r) \le -\mathrm{VaR}_i\}} |p(r) - p_i(r)|\, dr \le |p - p_i|_1. \tag{4.115} \]

Furthermore, since $p_{\Delta\Pi}(\rho)$ is positive almost everywhere in a neighborhood of $-\mathrm{VaR}$ and
the left-hand side of the inequality goes to zero, we must have $\mathrm{VaR}_i \to \mathrm{VaR}$ as $i \to \infty$. So
value-at-risk is continuous with respect to the return distribution model $p$.

Hence, value-at-risk is a well-posed problem, given that cdf (4.113) is strictly increasing close
to the solution. This condition holds for the market-risk models we consider in this chapter.
However, for credit-risk models, this condition is often violated and value-at-risk is a
questionable measure of risk. For a detailed discussion and an axiomatic system of desirable
properties of general risk measures, see Artzner et al. [ADEH99]. In the preceding analysis
we examined perturbations of the model for risk-factor returns. In the future, it would be
interesting to extend the analysis to perturbations of $\Delta\Pi$.


4.8.2 Perturbations of the Return Model


The foregoing analysis shows that value-at-risk, defined by equation (4.111), is well posed
if the cdf $\Phi(x)$ is strictly increasing for values of $x$ in a neighborhood of the solution. We
argued that value-at-risk is continuous for changes in $p(r)$, but the analysis does not indicate
the size of the resulting perturbations. In this section, we quantify the change in value-at-risk
for a perturbation of the pdf $p(r)$.


4.8.2.1 Proof of a First-Order Perturbation Property 


We now derive a variational, first-order perturbation property for value-at-risk. Consider
a differentiable pdf $p_{\Delta\Pi}(\rho)$ as given by equation (4.112). The set of probability density
functions is the subset of functions in $L^1$ that integrate to 1.²⁵ Furthermore, if $p$ and $q$ are
pdfs, then $hp + (1-h)q$ is a pdf for all $h \in [0, 1]$; i.e., the pdfs are a convex set. Consider a
variation $v$ where

\[ v_{\Delta\Pi}(\rho) = \int_{\{\Delta\Pi(r) = \rho\}} \frac{v(r)}{|D\Delta\Pi(r)|}\, dA \tag{4.116} \]

is continuous. In addition, for $v$ to be an admissible variation, the function $p + hv$ must be a
pdf for all $h$ in some interval $[0, \epsilon)$, $\epsilon > 0$.
For an admissible variation, the value-at-risk is a function of $h$ that satisfies

\[ \int_{-\infty}^{-\mathrm{VaR}(h)} d\rho \int_{\{\Delta\Pi(r) = \rho\}} \frac{p(r) + h v(r)}{|D\Delta\Pi(r)|}\, dA = 1 - \alpha. \]

The function on the left side has the form

\[ F(x, h) = \int_{-\infty}^{x} d\rho \int_{\{\Delta\Pi(r) = \rho\}} \frac{p(r) + h v(r)}{|D\Delta\Pi(r)|}\, dA, \]

with first partial derivatives

\[ \frac{\partial F}{\partial x} = \int_{\{\Delta\Pi(r) = x\}} \frac{p(r) + h v(r)}{|D\Delta\Pi(r)|}\, dA \tag{4.117} \]

and

\[ \frac{\partial F}{\partial h} = \int_{-\infty}^{x} d\rho \int_{\{\Delta\Pi(r) = \rho\}} \frac{v(r)}{|D\Delta\Pi(r)|}\, dA. \]

Since the density in equation (4.112) is continuous, value-at-risk is well posed for $h = 0$
if and only if equation (4.117) is positive for $h = 0$,

\[ \left. \frac{\partial F}{\partial x} \right|_{h=0} = p_{\Delta\Pi}(x) > 0. \]

25 A function $f$ is in the function space $L^1$ if it is measurable and has finite $L^1$ norm, i.e., if
$|f|_1 = \int |f(x)|\, dx < \infty$.


Assuming that value-at-risk is well posed, the implicit function theorem guarantees that the
solution $\mathrm{VaR}(h)$ to

\[ F(-\mathrm{VaR}(h), h) = 1 - \alpha \]

is continuously differentiable for $h$ in some interval $[0, \delta)$, where $\delta < \epsilon$ (see Rudin [Rud76]).
The derivative of $\mathrm{VaR}(h)$ at $h = 0$ is

\[ \mathrm{VaR}'(0) = \frac{1}{p_{\Delta\Pi}(-\mathrm{VaR})} \int_{\{\Delta\Pi(r) \le -\mathrm{VaR}\}} v(r)\, dr. \tag{4.118} \]

In the terminology of variational calculus, $\mathrm{VaR}'(0)$ is the Gâteaux variation in the direction
of $v$. Taylor’s theorem gives the linear approximation

\[ \mathrm{VaR}(h) = \mathrm{VaR}(0) + \mathrm{VaR}'(0)\, h + O(h^2). \tag{4.119} \]
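Formula (4.118) can be checked numerically in the simplest setting. The sketch below assumes a single risk factor with $\Delta\Pi(r) = r$, so that $p_{\Delta\Pi} = p$ and the surface integrals collapse; the baseline model $p$, the alternative model $q$, and the variation $v = q - p$ are illustrative choices, not taken from the text. With this $v$, $p + hv = (1-h)p + hq$ is a pdf for $h \in [0, 1]$, and the derivative from (4.118) is compared with a one-sided finite difference of $\mathrm{VaR}(h)$.

```python
from statistics import NormalDist

P = NormalDist(0.0, 1.0)      # baseline return model p (assumed)
Q = NormalDist(0.0, 1.5)      # alternative model q; variation v = q - p
ALPHA = 0.99


def var_h(h, lo=-20.0, hi=20.0):
    """VaR(h) for the perturbed cdf F(x, h) = (1 - h) P(x) + h Q(x)."""
    target = 1.0 - ALPHA
    for _ in range(200):                       # plain bisection on x = -VaR
        mid = 0.5 * (lo + hi)
        if (1.0 - h) * P.cdf(mid) + h * Q.cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return -0.5 * (lo + hi)


var0 = var_h(0.0)
# Equation (4.118) with Delta-Pi(r) = r: the integral of v over r <= -VaR,
# divided by the density at -VaR.
formula = (Q.cdf(-var0) - P.cdf(-var0)) / P.pdf(-var0)
h = 1e-5
finite_diff = (var_h(h) - var0) / h
```

The two numbers agree up to the $O(h)$ error of the finite difference. The derivative is positive here because the fatter-tailed alternative moves probability mass below $-\mathrm{VaR}$.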


4.8.2.2 Error Bounds and the Condition Number 


Taking the absolute value of linear approximation (4.119), we get an estimate of the
absolute error,

\[ |\mathrm{VaR}(h) - \mathrm{VaR}(0)| \le |\mathrm{VaR}'(0)| \cdot |h| + O(|h|^2). \]

Note that derivative $\mathrm{VaR}'(0)$ depends on the variation $v$.²⁶ So in general, different variations
give different derivatives, and we write $\mathrm{VaR}_v'(0)$ for the Gâteaux variation in the direction $v$ to
emphasize this dependency. Also, since $\mathrm{VaR}_v(0)$ is independent of $v$, we write $\mathrm{VaR}$ instead.

There are many possible metrics that could be used to measure the distance between two
pdfs. We choose to consider the metric induced by the $L^1$-norm,

\[ d(p, q) = |p - q|_1, \]

mostly because it leads to an elegant result. We may without loss of generality assume that
the variations have unit length, since we may rescale $v$ and $h$ simultaneously to achieve this.
Remember that the admissible variations satisfy the constraint that $p + hv$ is a pdf for all $h$
in some interval $[0, \epsilon)$. Since

\[ \int_{\mathbb{R}^n} [p(r) + h v(r)]\, dr = 1, \qquad h \in [0, \epsilon), \]

it follows that

\[ \int_{\mathbb{R}^n} v(r)\, dr = 0. \tag{4.120} \]

26 Similarly, the constant in the asymptotic term $O(h^2)$ depends on $v$. Recall that $O(h^2)$ denotes a function $f(h)$
such that $|f(h)| \le C h^2$ for all sufficiently small $h$. For our error bound, $C$ depends on the variation $v$.


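The function $v = v_+ + v_-$ can be separated into its positive $v_+$ and negative $v_-$ parts. Since $v$
has unit length and satisfies equation (4.120), we obtain

\[ \int_{\mathbb{R}^n} v_+(r)\, dr = -\int_{\mathbb{R}^n} v_-(r)\, dr = \frac{1}{2}. \tag{4.121} \]

Property (4.121) can be used to bound the integral term in equation (4.118),

\[ \left| \int_{\{\Delta\Pi(r) \le -\mathrm{VaR}\}} v(r)\, dr \right| \le \frac{1}{2}. \]

Therefore, the derivative is bounded by

\[ |\mathrm{VaR}_v'(0)| \le \frac{1}{2\, p_{\Delta\Pi}(-\mathrm{VaR})} \]

for all admissible variations $v$ with unit length. The absolute error is bounded by

\[ |\mathrm{VaR}_v(h) - \mathrm{VaR}| \le \frac{|h|}{2\, p_{\Delta\Pi}(-\mathrm{VaR})} + O(|h|^2), \tag{4.122} \]

and the relative error is bounded by

\[ \frac{|\mathrm{VaR}_v(h) - \mathrm{VaR}|}{\mathrm{VaR}} \le \frac{|h|}{2\, \mathrm{VaR}\, p_{\Delta\Pi}(-\mathrm{VaR})} + O(|h|^2). \tag{4.123} \]

The condition number of value-at-risk is²⁷

\[ \kappa = \frac{1}{2\, \mathrm{VaR}\, p_{\Delta\Pi}(-\mathrm{VaR})}. \tag{4.124} \]

Relative error bound (4.123) provides the first-order error estimate

\[ \frac{|\mathrm{VaR}_v(h) - \mathrm{VaR}|}{\mathrm{VaR}} \lesssim \kappa\, |p - \tilde p|_1, \tag{4.125} \]

where $p$ is the original pdf, $\tilde p = p + hv$ is a perturbed pdf, and $v = (\tilde p - p)/|\tilde p - p|_1$ is the direction of
the perturbation.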
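For a single risk factor with normally distributed $\Delta\Pi$, the condition number (4.124) is available in closed form, which makes its growth with the confidence level easy to see. The sketch below is an illustration under that assumption; the default parameters and the function name `condition_number` are hypothetical.

```python
from statistics import NormalDist


def condition_number(alpha, mu=0.0, sigma=0.01):
    """kappa = 1 / (2 VaR p(-VaR)) for Delta-Pi ~ N(mu, sigma^2).

    The normal one-factor model and the default parameters are assumptions
    of this sketch; VaR solves P(Delta-Pi <= -VaR) = 1 - alpha.
    """
    dist = NormalDist(mu, sigma)
    var = -dist.inv_cdf(1.0 - alpha)
    return 1.0 / (2.0 * var * dist.pdf(-var))


for alpha in (0.95, 0.99, 0.999):
    print(alpha, condition_number(alpha))
```

With zero mean the result does not depend on $\sigma$, because $\mathrm{VaR}$ scales with $\sigma$ while the density at $-\mathrm{VaR}$ scales with $1/\sigma$. The condition number increases as $\alpha$ approaches 1, since the density at the quantile decays faster than the quantile grows.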





27 Note that this is a variational condition number. The standard condition number (see Rice [Ric66]) for the
problem $y = f(x)$ is defined as

\[ \kappa = \left| \frac{x f'(x)}{f(x)} \right|. \]

This can be interpreted in a directional sense. Consider $f: \mathbb{R}^n \to \mathbb{R}$; then the absolute error is

\[ f(x + \Delta x) - f(x) \approx \nabla f(x) \cdot \Delta x. \]


4.8.2.3 Example: Mixture Model 


We consider two portfolios that depend on a single risk factor, the stock price for BCE.
The first portfolio consists of a single stock. The second portfolio is a short position in a
European call option. The option is delta-hedged with a position in the stock; i.e., a stock
position has been chosen so that the $\Delta$ is zero. The option is at-the-money and has 3 months
to maturity. The returns are assumed to be normal,

\[ p(r) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(r - \mu)^2}{2\sigma^2}}. \]

Although the theory is valid for a large class of return models, we chose this example for its
simplicity and for accurate computations.

Figure 4.26 shows plots with some results of our experiment. The plots are for the stock
portfolio (for the option portfolio, similar plots were obtained). The plots are for value-at-risk
with $\alpha = 95\%$ and $\alpha = 99\%$. The continuous line shows equation (4.125) as a function of the
size of the perturbation $|p - \tilde p|_1$. We see that the relative error in value-at-risk grows rapidly
for small errors in $p$. For the stock portfolio, $\kappa$ can be computed directly. For the option
portfolio, we used a Monte Carlo method to compute value-at-risk and a Parzen estimator
for $p_{\Delta\Pi}(-\mathrm{VaR})$. The Monte Carlo method used 20,000 random normal samples and variance
reduction with antithetic variables. We computed the option price for each sample via the
Black-Scholes formula.

[Figure 4.26: two log-scale panels, "95% Value-at-Risk" and "99% Value-at-Risk", plotting the
value-at-risk relative error against $|p - \tilde p|_1$.]

FIGURE 4.26 The plots show the relative error in value-at-risk for a stock portfolio versus the size
of the perturbation $|p - \tilde p|_1$. The continuous line is the error bound in equation (4.125). The plus signs
correspond to random perturbations.

To compare equation (4.125) and actual relative errors for perturbations of the model, we
generated random densities of the form

\[ \tilde p = H_1 p_1 + H_2 p_2 + (1 - H_1 - H_2)\, p, \]

where $p_1$ and $p_2$ are normal pdfs. The random variables $H_1$ and $H_2$ are uniform with ranges
[0, 0.1] and [0, 0.01], respectively. The parameters of $p_1$ and $p_2$ were also generated at
random. The parameters were generated as

\[ \mu_1 = \mu(1 + 2M_1), \qquad \sigma_1 = \sigma\,|0.5 + 2U_1|, \]
\[ \mu_2 = \mu(1 + 10M_2), \qquad \sigma_2 = \sigma\,|0.5 + U_2|, \]

where $M_1$ and $M_2$ are standard normal random variables and $U_1$ and $U_2$ are standard uniform
random variables.
For each random mixture, we computed the relative error in value-at-risk and the size
of the perturbation, $|p - \tilde p|_1$. Each plus sign in the plots in Figure 4.26 marks the result for
a randomly perturbed problem. For the stock portfolio, we used 1000 randomly perturbed
densities. The relative error in value-at-risk is indeed smaller than the first-order,
worst-case estimate, equation (4.125). For the option portfolio, we computed the value-at-risk
and the norm $|p - \tilde p|_1$ with the Monte Carlo method. Since this procedure is much more
time consuming than for the stock portfolio, we had to limit the experiment to 300 random
densities. Although some samples are larger than the approximate bound, we conclude that
equation (4.125) is a good estimate of the relative error. The accuracy of the Monte Carlo
method is limited, and we see that these plots contain some simulation noise.

In this section, we have discussed the properties of value-at-risk equation (4.111). In our
analysis, we argued that in most cases value-at-risk is a well-posed problem. The requirement
for being well posed is that cdf (4.113) be strictly increasing close to $-\mathrm{VaR}$. An equivalent
condition is that equation (4.112) be positive almost everywhere in a neighborhood of the
solution. Credit risk is one important exception where these assumptions will typically not
hold, but it is a reasonable assumption for standard value-at-risk models.

Nevertheless, being well posed alone does not guarantee that a small error in the model
for returns translates into a small relative error in value-at-risk. To understand how such errors
affect the simulation, a variational perturbation theory was developed. The theory applies to
problems that are sufficiently smooth and for smooth variations in the model density of the
returns. The advantage of the variational approach is a theory that is model independent;
it can therefore be used to quantify model risk. The theory provides estimate (4.125) for
the relative error where the condition number can be computed. The stumbling block is
to find $p_{\Delta\Pi}(-\mathrm{VaR})$. In some methods, such as the fast convolution method, $p_{\Delta\Pi}(-\mathrm{VaR})$
is computed. In other methods, for example, a Monte Carlo method, $p_{\Delta\Pi}(-\mathrm{VaR})$ must be
computed with a density estimator; see, for example, [TT90].

The condition number, equation (4.124), controls the size of the relative error. The problem
becomes increasingly ill conditioned as $\mathrm{VaR}\, p_{\Delta\Pi}(-\mathrm{VaR})$ decreases to zero. This confirms that
the empirical observation that small perturbations to the return model (caused by changes
in the model, in the data, or in the estimation procedure) cause large changes in
the simulation result reflects an intrinsic property of the problem for large $\alpha$. Value-at-risk is
ill conditioned for extreme levels of confidence.
CHAPTER 5


Project: Arbitrage Theory 


The purpose of this exercise is to detect arbitrage opportunities given a payoff matrix and a 
set of asset prices. The arbitrage theorem is analyzed within the simple context of a single- 
period financial model. Some of the basic finance concepts, terminology, and formalism are 
reintroduced in a more practical form. Following this, an example that illustrates the logic 
behind derivative asset pricing within the single-period model is presented. The arbitrage 
theorem is discussed in explicit matrix format for a single-period model with a finite number 
of assets and states. This provides all of the background needed to automate arbitrage with a 
chosen number of states. As is shown, the problems underlying the single-period model are 
simply related to finding solutions to a linear system of equations. 
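As a small illustration of the linear-algebra view, the sketch below treats the smallest nontrivial market: two assets and two states. It relies on the standard form of the arbitrage theorem, stated here as an assumption ahead of the chapter's own development: with payoff matrix $D$ (rows are assets, columns are states) and price vector $p$, the market is arbitrage free exactly when $D\psi = p$ has a solution $\psi$ with strictly positive components (the state prices). When $D$ is an invertible $2 \times 2$ matrix the solution is unique, so it suffices to solve the system and inspect the signs. The function names and the numbers are hypothetical.

```python
def state_prices_2x2(D, p):
    """Solve D psi = p for a 2-asset, 2-state market by Cramer's rule."""
    det = D[0][0] * D[1][1] - D[0][1] * D[1][0]
    if det == 0.0:
        raise ValueError("payoff matrix is singular")
    psi1 = (p[0] * D[1][1] - D[0][1] * p[1]) / det
    psi2 = (D[0][0] * p[1] - p[0] * D[1][0]) / det
    return psi1, psi2


def admits_arbitrage(D, p):
    """True when some state price is not strictly positive."""
    return min(state_prices_2x2(D, p)) <= 0.0


# Row 1: a bond paying 1.05 in both states; row 2: a stock paying 120 or 90.
D = [[1.05, 1.05],
     [120.0, 90.0]]
fair = admits_arbitrage(D, [1.0, 100.0])        # consistent prices
mispriced = admits_arbitrage(D, [1.0, 120.0])   # stock costs its best payoff
```

In the second case the stock costs 120 today but never pays more than 120, while the bond earns 5%, so shorting the stock and buying bonds is an arbitrage. The negative state price detects exactly this.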

Worksheet: arb 

Required Libraries: MFioxl, MFBlas, MFRangen, MFLapack 


5.1 Basic Terminology and Concepts: Asset Prices, States, 
Returns, and Pay-Offs 


We let the index $t$ represent time. The first object we introduce is a price vector. That is, all
securities (options, futures, forwards, bonds, stocks, etc.) are represented by a vector of $N$
asset prices, which we can denote simply by $p(t)$:

$$p(t) = \begin{pmatrix} p_1(t) \\ p_2(t) \\ \vdots \\ p_N(t) \end{pmatrix}. \tag{5.1}$$

The asset price $p_1(t)$ can typically represent riskless borrowing or lending, such as a U.S.
Treasury bill, $p_2(t)$ can denote a stock price $S_t$, $p_3(t)$ a call or put option on the same stock $S_t$,
etc. In a discrete time series the prices are given by a series of vectors $p(0), p(1), \ldots, p(t),
p(t+1), \ldots$. Note that in a single-period model $t$ = present time and $T = t+1$ is the terminal
time of any trading period $[t, t+1]$. In terms of the base assets then $A_t^i = p_i(t)$, $i = 1, \ldots, N$.

Next we introduce the concept of states of the world. That is, we assume that each possible
outcome or scenario corresponds to an elementary event, or state of the world, $\omega_i$, where there
is only a finite number $M$ of them: $i = 1, \ldots, M$. These states are mutually exclusive, with
at least one of them occurring with nonzero probability. All possible states are represented
by the set $\Omega = \{\omega_1, \ldots, \omega_M\}$.

Financial assets will attain different values and give rise to differing payouts corresponding
to the different states $\omega_j$. Shortly we discuss in detail an instructive example. Before that,
however, we recall a couple of other concepts. One is that of payoffs $D_{ij}$, which represent
the number of units of account paid out per unit of security $i$ in the state $j$. Generally, for
an $N$-asset and $M$-state system we can represent all single-period pay-offs by an $N \times M$
dividend matrix for an interval $[t, t+1]$:

$$D = \begin{pmatrix} D_{11} & \cdots & D_{1M} \\ \vdots & & \vdots \\ D_{N1} & \cdots & D_{NM} \end{pmatrix}. \tag{5.2}$$


This payoff matrix can be interpreted in two different ways. The first is that each ith row 
of the matrix corresponds to pay-offs for one unit of a given ith security in all the different 
states of the world. In the second interpretation, each jth column represents pay-offs for all 
the different assets within a given jth state of the world. 

The other concept of importance is that of a portfolio. Recall from Chapter 1 that a portfolio
is defined as a linear combination of assets or securities. That is, one can generally have
positions given by $\theta_i$ in the $i$th asset, and by specifying all such $N$ positions $\theta_i$, $i = 1, \ldots, N$,
we have uniquely specified a portfolio as a vector,

$$\theta = \begin{pmatrix} \theta_1 \\ \vdots \\ \theta_N \end{pmatrix}. \tag{5.3}$$

Positive $\theta_i$ correspond to long positions and negative values correspond to short positions.
A zero position $\theta_i = 0$ implies that the $i$th asset is not included in the portfolio. A portfolio
that delivers the same pay-off regardless of any possible state of the world is defined as
riskless. By taking the dot product of $\theta$ with the asset price vector $p_t = p(t)$ we obtain the
value of the portfolio at time $t$:

$$V_t^{\theta} = \theta \cdot p_t = \sum_{i=1}^{N} \theta_i p_i(t) = \sum_{i=1}^{N} \theta_i A_t^i. \tag{5.4}$$

The payoff $V_{t+1}^{\theta}(\omega_j)$, denoted here by $\Lambda_j$, for the portfolio given by $\theta$ in a given $j$th state
is then expressible as a sum over all asset pay-offs weighted by their respective positions,
where $D_{ij} = A_{t+1}^i(\omega_j)$,

$$\Lambda_j = \sum_{i=1}^{N} D_{ij}\theta_i = \sum_{i=1}^{N} (D')_{ji}\theta_i. \tag{5.5}$$

The superscript $'$ stands for matrix transpose. We can therefore express the payoff vector with
components $\Lambda_j$, $j = 1, \ldots, M$, in matrix form, $\Lambda = D'\theta$,

$$\begin{pmatrix} \Lambda_1 \\ \vdots \\ \Lambda_M \end{pmatrix} = \begin{pmatrix} D_{11} & \cdots & D_{N1} \\ \vdots & & \vdots \\ D_{1M} & \cdots & D_{NM} \end{pmatrix} \begin{pmatrix} \theta_1 \\ \vdots \\ \theta_N \end{pmatrix}. \tag{5.6}$$
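
As a minimal sketch of equations (5.4) to (5.6), the following Python fragment evaluates the present value $\theta \cdot p_t$ and the payoff vector $\Lambda = D'\theta$. The function names and the three-asset, two-state market (bond, stock, call) are hypothetical choices made only for illustration; they are not part of the arb worksheet.

```python
def portfolio_value(theta, prices):
    """Present value V_t = theta . p_t, as in equation (5.4)."""
    return sum(th * p for th, p in zip(theta, prices))


def payoff_vector(D, theta):
    """Payoff per state, Lambda_j = sum_i D[i][j] * theta[i], as in equation (5.5)."""
    n_assets, n_states = len(D), len(D[0])
    return [sum(D[i][j] * theta[i] for i in range(n_assets)) for j in range(n_states)]


# Hypothetical market: rows are bond, stock, call struck at 100; columns are states.
prices = [1.0, 100.0, 25.0]
D = [[1.07, 1.07],
     [50.0, 150.0],
     [0.0, 50.0]]

theta = [0.0, 1.0, -2.0]          # long one stock, short two calls

value = portfolio_value(theta, prices)
payoffs = payoff_vector(D, theta)
print(value, payoffs)
```

With these illustrative numbers the portfolio pays the same amount in both states, so it is riskless in the sense defined above.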


5.2 Arbitrage Portfolios and the Arbitrage Theorem 


As in Chapter 1, we define $\theta$ to be an arbitrage portfolio, sometimes simply called an
arbitrage, if either one of the following conditions applies:


(i) $p_t \cdot \theta = 0$ and $D'\theta \geq 0$, where $(D'\theta)_j > 0$ for some $j$.
(ii) $p_t \cdot \theta < 0$ and $D'\theta \geq 0$.


Note that these vector inequalities are meant to be applicable component by component.
In case (i) the portfolio guarantees a positive return in some states with no possible loss, yet
costs nothing to purchase. In case (ii) the portfolio will guarantee a nonnegative return and
has a negative cost to purchase.

Finally, we can state the arbitrage theorem as follows: 


1. If there are no arbitrage opportunities then there exist positive constants $\psi_i > 0$, $i =
1, \ldots, M$ (in vector notation we write simply $\psi > 0$, where $\psi$ is the vector of $\psi_i$
components), such that

$$p_t = D\psi. \tag{5.7}$$

2. Conversely, if such positive constants exist, i.e., if equation (5.7) holds with $\psi > 0$,
then there is no arbitrage.


One notes that, apart from a positive constant [i.e., the inverse of the discount factor
as shown in upcoming equation (5.10)], the $\psi_i$ correspond to certain nonzero probabilities
of occurrence for all the states $i = 1, \ldots, M$. In fact, these coefficients give the risk-neutral
probabilities for the correct pricing of financial securities, as explained in the following
section and as was observed in the discrete case of the fundamental theorem of asset pricing
given in Chapter 1. In matrix form, equation (5.7) reads

$$\begin{pmatrix} p_1 \\ \vdots \\ p_N \end{pmatrix} = \begin{pmatrix} D_{11} & \cdots & D_{1M} \\ \vdots & & \vdots \\ D_{N1} & \cdots & D_{NM} \end{pmatrix} \begin{pmatrix} \psi_1 \\ \vdots \\ \psi_M \end{pmatrix}. \tag{5.8}$$

In the arb spreadsheet assignment we consider the case $M = N$ and the special type of payoff
matrix

$$D = \begin{pmatrix} (1+R) & \cdots & (1+R) \\ D_{21} & \cdots & D_{2M} \\ \vdots & & \vdots \\ D_{N1} & \cdots & D_{NM} \end{pmatrix}. \tag{5.9}$$

The first row has all equal payoff values and corresponds to the riskless return on a money-market
account or bond; i.e., $p_1 = p_1(t) = 1$, with single-period rate of return $R$. Without loss of
generality, here we have simply set the bond's present value to one unit of worth. The first
row in equation (5.8) of the arbitrage theorem then gives

$$(1+R)\sum_{i=1}^{M} \psi_i = \sum_{i=1}^{M} \tilde{\psi}_i = 1. \tag{5.10}$$

The coefficients $\tilde{\psi}_i = (1+R)\psi_i$ defined here correspond to the risk-neutral probabilities for all possible
states. In fact, the $\tilde{\psi}_i$ are recognized as being the $q_i$ probabilities used to define the pricing
measure in the fundamental theorem of asset pricing discussed in Chapter 1. They sum up to
unity as required and also satisfy the condition $0 < \tilde{\psi}_i < 1$. As noted earlier, these probabilities
are very different from the real-world probabilities, which provide no information on the
risk-neutral probabilities used for pricing. The risk-neutral probabilities therefore exist with
the correct properties mentioned if, and only if, there is no arbitrage.


5.3 An Example of Single-Period Asset Pricing: Risk-Neutral 
Probabilities and Arbitrage 


The single-period setting assumes that time consists of the present time $t$ and a later time
$T = t+1$ and that there is a finite time separation. We consider here a portfolio consisting of
just one bond with present value of unity, $B(t) = 1$, one asset (or stock) $S$, and a call option
$C$ on the underlying stock $S$. Moreover, we assume only two possible states of the world.
In this situation the stock, which has present value $S(t)$, can attain either of two values:
$S_1(t+1)$ or $S_2(t+1)$ at time $t+1$. Accordingly, the option with present value $C(t)$ can take
on the values given by $C_1(t+1)$ or $C_2(t+1)$ in state $\omega_1$ and $\omega_2$, respectively. No matter
what the outcome, however, the bond has a fixed (riskless) return of $1+R$, with $R$ being the
single-period rate of return. In this situation we have a $3 \times 2$ payoff matrix and the foregoing
arbitrage theorem gives

$$\begin{pmatrix} 1 \\ S(t) \\ C(t) \end{pmatrix} = \begin{pmatrix} (1+R) & (1+R) \\ S_1(t+1) & S_2(t+1) \\ C_1(t+1) & C_2(t+1) \end{pmatrix} \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}, \tag{5.11}$$

which implies a linear system of three equations:

$$\tilde{\psi}_1 + \tilde{\psi}_2 = 1, \tag{5.12}$$
$$\tilde{\psi}_1 S_1(t+1) + \tilde{\psi}_2 S_2(t+1) = (1+R)S(t), \tag{5.13}$$
$$\tilde{\psi}_1 C_1(t+1) + \tilde{\psi}_2 C_2(t+1) = (1+R)C(t). \tag{5.14}$$

Here we have used the same definition as before for the risk-neutral probabilities $\tilde{\psi}_i =
(1+R)\psi_i$. These equations have the familiar form of the binomial pricing equations for
options, as discussed in the project that deals with binomial lattice pricing. That is, the price
today of a security is given as the discounted risk-neutral expectation of its possible future
values. We also note that if we allow for three states of
the world, we then obtain pricing equations that resemble the trinomial pricing equations.

To demonstrate an example of arbitrage, let us consider $R = 7\%$ and the two possible
values at time $t+1$: $S_1(t+1) = 50$ dollars, $S_2(t+1) = 150$ dollars, where $S(t) = 100$ dollars.
Say the call option $C$ has strike price of 100 dollars and expires exactly at time $t+1$. This
option then has pay-off of zero and 50 dollars, respectively. If we denote the price of the call
option today as $C$, then equations (5.12) to (5.14) give

$$\tilde{\psi}_1 + \tilde{\psi}_2 = 1, \tag{5.15}$$
$$0.5\,\tilde{\psi}_1 + 1.5\,\tilde{\psi}_2 = 1.07, \tag{5.16}$$
$$50\,\tilde{\psi}_2 = 1.07\,C. \tag{5.17}$$

By satisfying the first two equations we actually obtain the arbitrage-free price for $C$ by
substituting the resulting risk-neutral values $\tilde{\psi}_1 = 0.43$, $\tilde{\psi}_2 = 0.57$ into the third equation.
The correct (no-arbitrage) price is therefore $C = 26.6355$ dollars. If, however, we are given
a market price for $C = 25$ dollars and we wish to answer the question of whether there
is arbitrage or not in this case, then we solve equation (5.17), giving $\tilde{\psi}_2 = 0.535$, and
then equation (5.16) gives $\tilde{\psi}_1 = 0.535$. These values, however, do not satisfy the probability
conservation equation (5.15), since $\tilde{\psi}_1 + \tilde{\psi}_2 = 1.07 \neq 1$; one therefore concludes that there is
indeed arbitrage at that market price.
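
The arithmetic of this example can be reproduced with a short Python sketch. The variable and function names below are illustrative assumptions; the numbers are those of equations (5.15) to (5.17).

```python
R = 0.07
S0, S_down, S_up = 100.0, 50.0, 150.0
strike = 100.0

# Equations (5.15) and (5.16): q1 + q2 = 1 and q1*S_down + q2*S_up = (1 + R)*S0.
q2 = ((1.0 + R) * S0 - S_down) / (S_up - S_down)
q1 = 1.0 - q2

# Equation (5.17): the call pays max(S - strike, 0) in each state.
call_payoffs = [max(S_down - strike, 0.0), max(S_up - strike, 0.0)]
fair_call = (q1 * call_payoffs[0] + q2 * call_payoffs[1]) / (1.0 + R)


def implied_weights(market_call):
    """Solve (5.17) and then (5.16) for the weights, leaving (5.15) as a check."""
    w2 = (1.0 + R) * market_call / call_payoffs[1]
    w1 = ((1.0 + R) * S0 - w2 * S_up) / S_down
    return w1, w2


w1, w2 = implied_weights(25.0)
has_arbitrage = abs(w1 + w2 - 1.0) > 1e-12   # probability conservation violated

print(round(q1, 4), round(q2, 4), round(fair_call, 4))
print(round(w1, 4), round(w2, 4), has_arbitrage)
```

The check on `w1 + w2` is exactly the probability conservation test used in the text: a market price of 25 dollars gives weights that sum to 1.07 rather than 1.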


5.4 Arbitrage Detection and the Formation of Arbitrage Portfolios 
in the N-Dimensional Case 


The preceding example involves an overdetermined system of linear equations. Now, however,
we shall consider a uniquely specified system where the number of unknowns is equal to the
number of equations. Hence, we consider the case of $N$ states and $N$ assets, i.e., $M = N$.
We shall assume that one of the assets always corresponds to a bond with fixed rate of
return $R$. The payoff matrix has the form given in equation (5.9), where the first row has
all equal elements of value $(1+R)$. The corresponding system of $N$ equations is given in
equation (5.18). The problem is then the following. Generate an arbitrary price vector in
one of two fashions: Set $p_1(t) = 1$ and then generate independent components $p_i(t)$ ($i \geq 2$)
distributed either (i) uniformly as integers lying within some given minimum and maximum
integer values or (ii) continuously using some standard normal distribution, say, $p_i(t) \sim N(0,1)$
($i \geq 2$). Similarly, generate $N(N-1)$ arbitrary payoff matrix elements $D_{ij}$ ($i \geq 2$) in the
discrete or continuous cases, respectively. The numerical library called MFRangen is useful
for random-number and random-matrix generation. For a given generated pair of price vector
$p(t)$ and payoff matrix $D$ one obtains the vector of risk-neutral probabilities $\tilde{\psi} = (1+R)\psi$
by solving the linear system

$$\begin{pmatrix} 1 \\ p_2 \\ \vdots \\ p_N \end{pmatrix} = \begin{pmatrix} (1+R) & \cdots & (1+R) \\ D_{21} & \cdots & D_{2N} \\ \vdots & & \vdots \\ D_{N1} & \cdots & D_{NN} \end{pmatrix} \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_N \end{pmatrix}. \tag{5.18}$$

In practice, one can solve this system numerically by inverting the payoff matrix using a
routine based on the singular value decomposition. Note that the first equation in the system is
that of probability conservation [this is equation (5.10) with $M = N$]. Arbitrage then exists if
the solution gives at least one nonpositive component, that is, if $\psi_i \leq 0$ for some $i$ (since
we are enforcing probability conservation). For every such $i$ we then have a corresponding $i$th
state, which we can use to form an arbitrage portfolio that we denote by $\theta^{(i)}$ with components
$\theta^{(i)}_1, \ldots, \theta^{(i)}_N$. According to the discussion on equations (5.5) and (5.6), then, the payoff vector
corresponding to the $i$th state alone can be obtained by setting only the $i$th component to a
nonzero positive value, $\Lambda^{(i)}_j = 1$ for $j = i$ (i.e., picking a number greater than zero) and setting
all other $j$ components to zero. Note that this corresponds to the pay-off of an Arrow-Debreu
security, yet with nonpositive initial value. The transpose of this $N$-dimensional pay-off
column vector, denoted by $\Lambda^{(i)}$, has a row representation of $(0, \ldots, 0, 1, 0, \ldots, 0)$, where
unity occurs in the $i$th position only. The arbitrage portfolio is then obtained by solving the
linear system of $N$ equations in the $N$ unknowns $\theta^{(i)}_j$, $j = 1, \ldots, N$, as in equation (5.5) or,
in matrix form:

$$D'\theta^{(i)} = \Lambda^{(i)}. \tag{5.19}$$

To obtain more arbitrage portfolios, one can repeat the preceding steps for the other state
components that led to arbitrage, i.e., for the other nonpositive $\psi_i$ components. To see
why $\theta^{(i)}$ is an arbitrage portfolio note that $p_t = D\psi$. So the portfolio has present value
$V_t^{\theta^{(i)}} = \theta^{(i)} \cdot p_t = (D'\theta^{(i)}) \cdot \psi = \Lambda^{(i)} \cdot \psi = \psi_i$. Since $\psi_i \leq 0$, then $V_t^{\theta^{(i)}} \leq 0$, and by construction the
terminal value or pay-off of this portfolio is given by $V_{t+1}^{\theta^{(i)}}(\omega_j) = \Lambda^{(i)}_j = 1$ (hence greater than
0) when $\omega_j = \omega_i$, yet $V_{t+1}^{\theta^{(i)}}(\omega_j) = 0$ (hence $\geq 0$) for all other states. From the definition of
single-period arbitrage we conclude that the portfolio $\theta^{(i)}$ is indeed an arbitrage.
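
The detection procedure of this section can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the linear systems are solved by Gaussian elimination with partial pivoting, used here as a simple stand-in for the SVD-based routine mentioned above, and the function names, the tolerance `tol`, and the 3 x 3 market are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(M[row][col]))
        if abs(M[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            for c in range(col, n + 1):
                M[row][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def transpose(A):
    return [list(row) for row in zip(*A)]


def find_arbitrage(prices, D, tol=1e-12):
    """Return state prices psi and one portfolio per state with psi_i <= tol."""
    psi = solve(D, prices)                        # p = D psi, equation (5.18)
    Dt = transpose(D)
    portfolios = {}
    for i, psi_i in enumerate(psi):
        if psi_i <= tol:
            unit = [1.0 if j == i else 0.0 for j in range(len(psi))]
            portfolios[i] = solve(Dt, unit)       # D' theta = Lambda^(i), equation (5.19)
    return psi, portfolios


# Hypothetical market with N = 3: the first asset is the bond with R = 25%.
R = 0.25
D = [[1.0 + R, 1.0 + R, 1.0 + R],
     [1.0, 2.0, 4.0],
     [3.0, 1.0, 0.0]]
prices = [1.0, 0.7, 2.0]

psi, portfolios = find_arbitrage(prices, D)
for state, theta in portfolios.items():
    cost = sum(t * p for t, p in zip(theta, prices))
    payoff = [sum(D[i][j] * theta[i] for i in range(3)) for j in range(3)]
    print(state, cost, payoff)
```

For each flagged state the cost of the portfolio equals the nonpositive $\psi_i$, while its pay-off is the unit vector for that state, which is the argument made at the end of this section.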


CHAPTER 6


Project: The Black-Scholes 
(Lognormal) Model 


The purpose of this project is to develop pricing routines for plotting and analyzing the 
Black-Scholes price for European calls, puts, and butterfly spreads as well as for the corre- 
sponding sensitivities — delta, gamma, rho, vega, and theta — as a function of the five basic 
parameters that make up the plain-vanilla Black-Scholes pricing formula. 

Worksheet: bs 

Required Libraries: MFioxl, MFFuncs, MFStat 


6.1 Black-Scholes Pricing Formula 


The celebrated Black-Scholes pricing formula is quite straightforward since it makes use of 
the standard normal distribution. Building the necessary Visual Basic code for this spreadsheet 
will, however, quickly familiarize the user with the use of ActiveX numerical library methods 
for input/output to Excel. One of the features of the spreadsheet is to allow the user the 
flexibility of inputting any values for the fixed parameters while also allowing a choice for 
the range of plotting. 

Although symmetries of the Black-Scholes formula can be used to reduce the number of 
dependent functional parameters, the price of a call option can be most explicitly written (as 
seen in Chapter 1) as a function of five variables (or parameters): the interest rate $r$ (assumed
constant), the stock price $S$, the time to maturity $\tau = T - t$ ($t$ = current calendar time and $T$ =
maturity calendar time), the volatility $\sigma$ (assumed constant), and the strike price $K$. The
Black-Scholes formula for the value of a plain-vanilla European call option is

$$C(S, K, r, \sigma, \tau) = S N(d_+) - K e^{-r\tau} N(d_-), \tag{6.1}$$

where

$$d_+ = \frac{\log(S/K) + (r + \frac{1}{2}\sigma^2)\tau}{\sigma\sqrt{\tau}}, \tag{6.2}$$

and $d_- = d_+ - \sigma\sqrt{\tau}$. The function $N(x)$ is the cumulative standard normal distribution
at $x$.

As an example of the functionality built into the bs spreadsheet, a plot of the value of 
a call option C as a function of S in the range Smin (the minimum spot price) to Smax (the 
maximum spot price) is generated via equation (6.1) while holding $r$, $K$, $\tau$, and $\sigma$ fixed.
A plot of the option price as a function of varying the interest rate while holding the other 
four variables constant is generated in a similar manner. The same plotting functionality is 
also generated for varying volatility, time-to-maturity, and strike price while simultaneously 
making use of the Black-Scholes formula at appropriate interval points. The interface for the 
bs spreadsheet also allows for the choice of plotting a variable input number of points for 
each graph. 

Put-call parity

$$P = C - S + K e^{-r\tau} \tag{6.3}$$

can also be used to study the corresponding prices and sensitivities of puts. The dimensionality
of the variables is worth emphasizing and is as follows. Volatility refers to a per annum (i.e.,
yearly) time scale and has units of year$^{-1/2}$. Maturity is in years, so $\sigma\sqrt{\tau}$ is dimensionless.
The interest rate is per annum and has units of year$^{-1}$, making $r\tau$ dimensionless. Both strike
and spot are in units of currency (e.g., dollars). One noteworthy property of the Black-Scholes
formula is its so-called numeraire invariance. This essentially implies that prices can be made
dimensionless so that the formula is invariant with respect to the underlying currency. This
is easily seen by dividing equation (6.1) throughout by the strike, giving

$$C/K = (S/K) N(d_+) - e^{-r\tau} N(d_+ - \sigma\sqrt{\tau}), \tag{6.4}$$

where $d_+$ is also a function of the dimensionless quantity $S/K$.
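
A minimal Python sketch of equations (6.1) to (6.4) is given here. It is not the Visual Basic code of the bs worksheet; the function names are illustrative, and the cumulative normal is computed from the standard library error function rather than from MFStat.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Cumulative standard normal distribution N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_call(S, K, r, sigma, tau):
    """European call price, equations (6.1) and (6.2)."""
    d_plus = (log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    d_minus = d_plus - sigma * sqrt(tau)
    return S * norm_cdf(d_plus) - K * exp(-r * tau) * norm_cdf(d_minus)


def bs_put(S, K, r, sigma, tau):
    """European put price from put-call parity, equation (6.3)."""
    return bs_call(S, K, r, sigma, tau) - S + K * exp(-r * tau)


call = bs_call(100.0, 100.0, 0.05, 0.20, 1.0)
put = bs_put(100.0, 100.0, 0.05, 0.20, 1.0)

# Numeraire invariance, equation (6.4): C depends on S and K only through K * f(S/K).
scaled_call = 100.0 * bs_call(1.0, 1.0, 0.05, 0.20, 1.0)

print(round(call, 4), round(put, 4), round(scaled_call, 4))
```

Looping `bs_call` over a grid of spot values between Smin and Smax, with the other four parameters held fixed, gives the kind of curve that the worksheet plots.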

From the vanilla call or put options one can construct many other options with various 
payoff structures, as was discussed with the theory of static hedging in Chapter 1. One impor- 
tant pay-off that was discussed explicitly is the butterfly spread, as given by equation (1.228). 
Here we reconsider this option, with pay-off defined in a similar manner except for a trivial 
normalization constant. Namely, the pay-off is peaked at strike $K$ and has a nonzero width
of $2\delta K$. This pay-off is statically replicated by taking a long position in a vanilla call struck
at $K + \delta K$, another long position in a vanilla call struck at $K - \delta K$, and two short positions
in a vanilla call struck at $K$:

$$\Lambda(S) = (S - (K + \delta K))_+ + (S - (K - \delta K))_+ - 2(S - K)_+ = \begin{cases} (S - (K - \delta K))_+, & S \leq K, \\ ((K + \delta K) - S)_+, & S > K. \end{cases} \tag{6.5}$$

Note that from put-call parity one can also construct such a pay-off with a combination of
puts. The exact analytical expression for the Black-Scholes price of such a butterfly contract
maturing in time $\tau$ is hence

$$B_{\delta K}(S, K, r, \sigma, \tau) = C(S, K + \delta K, r, \sigma, \tau) + C(S, K - \delta K, r, \sigma, \tau) - 2C(S, K, r, \sigma, \tau), \tag{6.6}$$

where the call formula is given by equation (6.1). Figure 6.1 gives an example of the results
of the bs spreadsheet application obtained for two cases of chosen spot. As observed, the
plots illustrate the differing effects of volatility on the price of a relatively narrow butterfly
spread option for in-the-money versus out-of-the-money (below strike) options.

FIGURE 6.1 Price variations as volatility changes for an (a) in-the-money (spot = 100) versus
an (b) out-of-the-money (spot = 80) European butterfly option with fixed spread $\Delta K = 10$, strike
$K = 100$, $r = 5\%$ per annum, $\tau = 1$ year. Plot (a) is monotonically decreasing, whereas plot (b)
displays a pronounced maximum, as is expected within a lognormal density model for the stock
movements.
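
The butterfly price of equation (6.6) and the volatility dependence described for Figure 6.1 can be explored with the following sketch. The function names are illustrative assumptions, and the three volatilities are sample points only, not the grid used by the worksheet.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_call(S, K, r, sigma, tau):
    """European call price, equation (6.1)."""
    d_plus = (log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    d_minus = d_plus - sigma * sqrt(tau)
    return S * norm_cdf(d_plus) - K * exp(-r * tau) * norm_cdf(d_minus)


def butterfly(S, K, dK, r, sigma, tau):
    """Butterfly spread price, equation (6.6): long K+dK, long K-dK, short two at K."""
    return (bs_call(S, K + dK, r, sigma, tau)
            + bs_call(S, K - dK, r, sigma, tau)
            - 2.0 * bs_call(S, K, r, sigma, tau))


vols = [0.05, 0.15, 0.35]
at_strike = [butterfly(100.0, 100.0, 10.0, 0.05, v, 1.0) for v in vols]
below_strike = [butterfly(80.0, 100.0, 10.0, 0.05, v, 1.0) for v in vols]

for v, a, b in zip(vols, at_strike, below_strike):
    print(f"sigma={v:.2f}  spot=100: {a:.4f}  spot=80: {b:.4f}")
```

At spot 100 the three prices decrease with volatility, while at spot 80 the middle volatility gives the largest of the three prices, in line with the two shapes described in the caption.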

The observed changes in the option prices as one changes a parameter, such as volatility,
time to maturity, spot, interest rate, or strike and shape of the payoff function, can be
qualitatively understood by means of the risk-neutral pricing formula. Let us generally denote
by $V(S, K, r, \sigma, \tau)$ the option price for a payoff function $\Lambda(K, S)$. This pay-off can, for
instance, represent either a call, put, or butterfly spread struck at $K$. Note that for the case of
the butterfly the pay-off is, of course, also a function of the spread $\delta K$. As stated in Chapter 1,
the risk-neutral pricing formula gives

$$V(S, K, r, \sigma, \tau) = e^{-r\tau} \int_0^{\infty} p(S_\tau, S; \tau)\, \Lambda(K, S_\tau)\, dS_\tau, \tag{6.7}$$

where $p(S_\tau, S; \tau)$ is the lognormal transition probability density [i.e., equation (1.165)]

$$p(S_\tau, S; \tau) = \frac{1}{\sigma S_\tau \sqrt{2\pi\tau}}\, e^{-[\log(S/S_\tau) + (r - \frac{1}{2}\sigma^2)\tau]^2 / 2\sigma^2\tau}. \tag{6.8}$$

Observe that the density $p$ is actually a function of $\sigma\sqrt{\tau}$ as well as $r\tau$. The interest rate gives
rise to part of the drift of the center of $p$. The quantity $\sigma\sqrt{\tau}$ gives a negative contribution
to the drift, through the term $-\frac{1}{2}\sigma^2\tau$. More importantly, however, $\sigma\sqrt{\tau}$ determines the width (or
standard deviation) of the density. A direct interpretation of equation (6.7) shows that higher option prices
correspond to situations for which there is maximal overlap between the density $p$ and the
payoff function $\Lambda$, and vice versa. For smaller values of $\sigma\sqrt{\tau}$ (i.e., smaller volatility values
for fixed time-to-maturity or smaller time-to-maturity values for fixed volatility), the density
is more highly concentrated and is centered about the spot $S$. Figure 6.2 shows the changes in
overlap between the lognormal transition probability density and the butterfly pay-off struck
at $K = 100$ ($\delta K = 10$). Note that in order to keep the two functions on the same scale, the
pay-off has been multiplied by a normalization $1/(\delta K)^2$, giving unit payoff area with height
$1/\delta K$. Increases in the volatility parameter $\sigma$ correspond to more dispersion in the density,
hence giving less and less overlap with the butterfly pay-off in the (in-the-money) case where
the spot is at strike, $S = K$. For moderately out-of-the-money cases, increases in volatility
lead to a greater overlap for lower values of $\sigma$ (i.e., from $\sigma = 5\%$ to $15\%$), approaching a
maximum at an intermediate value, followed by a decrease in overlap at relatively higher
values (i.e., from $\sigma = 15\%$ to $35\%$). This argument is consistent with the price variations
observed in Figure 6.1. One can use the same reasoning to obtain the qualitative picture of
price variations one would expect in other circumstances. Another example, for instance, is
the case of a deeply out-of-the-money butterfly option whereby one expects a monotonically
increasing price as function of $\sigma$ over a wider range of $\sigma$ values, with sharper increases at
lower $\sigma$ values. Note that our overlap analysis can be applied to other pay-offs, such as calls
and puts.

FIGURE 6.2 Variations in overlap between the risk-neutral pricing density and the pay-off, as functions
of $S_\tau$, for an (a) in-the-money (spot $S = 100$) versus an (b) out-of-the-money (spot $S = 80$) European
butterfly option with spread $\Delta K = 10$ and strike $K = 100$. The interest rate $r = 5\%$ per annum, and time to
maturity $\tau = 1$ year. In both cases, the three lognormal density curves correspond to $\sigma = 5\%, 15\%, 35\%$
with horizontal axis as final stock level $S_\tau$.


6.2 Black-Scholes Sensitivity Analysis


Sensitivities of option prices with respect to changes in the underlying parameters $r$, $\tau$, $S$, $\sigma$
were also discussed in Chapter 1. As noted, these are of importance to hedging and computing
risk for nonlinear portfolios. Within the Black-Scholes formulation, these sensitivities are
obtained analytically by taking the respective partial derivatives of the option-pricing formula.
The $\Delta$, $\rho$, $\Theta$, and the vega, $\partial V/\partial\sigma$, of an option give the change in the option's price $V$ with
respect to changes in spot $S$, $r$, $\tau$, and $\sigma$, respectively. The other sensitivity of interest is $\Gamma$,
which gives the change in $\Delta$ with respect to a change in $S$.

For a vanilla call with price $C$ one can readily derive the following sensitivities:

$$\Delta_c = \frac{\partial C}{\partial S} = N(d_+), \tag{6.9}$$
$$\Gamma_c = \frac{\partial^2 C}{\partial S^2} = \frac{e^{-d_+^2/2}}{S\sigma\sqrt{2\pi\tau}}, \tag{6.10}$$
$$\rho_c = \frac{\partial C}{\partial r} = K\tau e^{-r\tau} N(d_-), \tag{6.11}$$
$$\frac{\partial C}{\partial \sigma} = S\sqrt{\tau/2\pi}\; e^{-d_+^2/2}. \tag{6.12}$$

The Black-Scholes PDE can be used to give

$$\Theta = \frac{\partial V}{\partial \tau} = (\sigma^2 S^2/2)\Gamma + r(S\Delta - V) \tag{6.13}$$

for any European-style option with value $V$. Hence,

$$\Theta_c = \frac{\partial C}{\partial \tau} = (\sigma^2 S^2/2)\Gamma_c + r(S\Delta_c - C), \tag{6.14}$$

with $\Delta_c$ and $\Gamma_c$ given by equations (6.9) and (6.10), respectively.
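The sensitivities for a vanilla put follow from put-call parity:

$$\Delta_p = \Delta_c - 1, \tag{6.15}$$
$$\Gamma_p = \Gamma_c, \tag{6.16}$$
$$\rho_p = \rho_c - K\tau e^{-r\tau}, \tag{6.17}$$
$$\frac{\partial P}{\partial \sigma} = \frac{\partial C}{\partial \sigma}, \tag{6.18}$$
$$\Theta_p = (\sigma^2 S^2/2)\Gamma_p + r(S\Delta_p - P), \tag{6.19}$$

with $\Delta_p$ and $\Gamma_p$ given by equations (6.15) and (6.16), respectively.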
The sensitivities for the butterfly spread option follow trivially by differentiation of
equation (6.6) and the use of equations (6.9)-(6.14), giving

$$\Delta_B(K) = \Delta_c(K + \delta K) + \Delta_c(K - \delta K) - 2\Delta_c(K), \tag{6.20}$$
$$\Gamma_B(K) = \Gamma_c(K + \delta K) + \Gamma_c(K - \delta K) - 2\Gamma_c(K), \tag{6.21}$$
$$\rho_B(K) = \rho_c(K + \delta K) + \rho_c(K - \delta K) - 2\rho_c(K), \tag{6.22}$$
$$\frac{\partial B(K)}{\partial \sigma} = \frac{\partial C(K + \delta K)}{\partial \sigma} + \frac{\partial C(K - \delta K)}{\partial \sigma} - 2\frac{\partial C(K)}{\partial \sigma}, \tag{6.23}$$
$$\Theta_B(K) = \Theta_c(K + \delta K) + \Theta_c(K - \delta K) - 2\Theta_c(K). \tag{6.24}$$

Note that in these formulas, the explicit dependence of the option sensitivities as functions
of the strike is depicted. Hence, equations (6.1), (6.3), (6.6), and (6.9)-(6.24) are used to
generate all option prices and sensitivities required within the Black-Scholes spreadsheet bs.
The numerical library MFStat is useful for computing the cumulative normal distribution
function.
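
As a hedged illustration of equations (6.9) to (6.19), the sketch below evaluates the call and put sensitivities analytically. The function names and dictionary keys are assumptions made for this example, and `theta` follows the convention of equation (6.13), i.e., the derivative with respect to time to maturity $\tau$.

```python
from math import erf, exp, log, pi, sqrt


def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def call_greeks(S, K, r, sigma, tau):
    """Price and sensitivities of a European call, equations (6.1) and (6.9)-(6.14)."""
    d_plus = (log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    d_minus = d_plus - sigma * sqrt(tau)
    price = S * norm_cdf(d_plus) - K * exp(-r * tau) * norm_cdf(d_minus)
    delta = norm_cdf(d_plus)
    gamma = exp(-0.5 * d_plus ** 2) / (S * sigma * sqrt(2.0 * pi * tau))
    rho = K * tau * exp(-r * tau) * norm_cdf(d_minus)
    vega = S * sqrt(tau / (2.0 * pi)) * exp(-0.5 * d_plus ** 2)
    theta = 0.5 * sigma ** 2 * S ** 2 * gamma + r * (S * delta - price)  # dC/dtau
    return {"price": price, "delta": delta, "gamma": gamma,
            "rho": rho, "vega": vega, "theta": theta}


def put_greeks(S, K, r, sigma, tau):
    """Put sensitivities from put-call parity, equations (6.3) and (6.15)-(6.19)."""
    c = call_greeks(S, K, r, sigma, tau)
    discounted_strike = K * exp(-r * tau)
    price = c["price"] - S + discounted_strike
    delta = c["delta"] - 1.0
    gamma = c["gamma"]
    theta = 0.5 * sigma ** 2 * S ** 2 * gamma + r * (S * delta - price)
    return {"price": price, "delta": delta, "gamma": gamma,
            "rho": c["rho"] - tau * discounted_strike,
            "vega": c["vega"], "theta": theta}


c = call_greeks(100.0, 100.0, 0.05, 0.20, 1.0)
p = put_greeks(100.0, 100.0, 0.05, 0.20, 1.0)
print({k: round(v, 4) for k, v in c.items()})
print({k: round(v, 4) for k, v in p.items()})
```

The butterfly sensitivities of equations (6.20) to (6.24) follow by combining three such dictionaries, evaluated at strikes $K + \delta K$, $K - \delta K$, and $K$, with weights 1, 1, and $-2$.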


CHAPTER 7


Project: Quantile-Quantile Plots 


The purpose of this project is to visualize kurtosis in risk-factor return distributions by means 
of quantile-quantile plots. The test cases include equity indices in 40 different currencies. 
Worksheet: qq 
Required Libraries: MFioxl, MFFuncs, MFStat, MFSort 


7.1 Log-Returns and Standardization 


Historical data series are provided in table format for the weekly returns on 40 different 
indices (e.g., the TSE100COMPX denotes the TSE100 composite index, SP500C denotes the 
Standard & Poor 500). The objective here is to create histograms for the P&L on the log- 
return time series for a given choice of index as well as plot the q-q (i.e., quantile-quantile) 
plot for the estimated cumulative distribution against the standardized log-returns on the same 
index. This allows one to display and study the deviations of the actual distributions from the 
standard normal distribution for the log-returns. 
The log-returns over a time period $dt$ at time $t$ are defined by

$$r_{t,i} = \log(S_{t,i}/S_{t-dt,i}) \tag{7.1}$$

for each index $i$. The value $S_{t,i}$ corresponds to a price for index $i$ at time $t$. To standardize
the returns, we first estimate the mean using

$$\mu_{i,dt} = E[r_i] = \frac{1}{N_{ret}} \sum_{k=1}^{N_{ret}} \log\left(\frac{S_{t_k,i}}{S_{t_k-dt,i}}\right) \tag{7.2}$$

over all $N_{ret}$ return dates $t_k$, and secondly estimate the standard deviation $\sigma_{i,dt}$ using

$$(\sigma_{i,dt})^2 = E[r_i^2] - (E[r_i])^2, \tag{7.3}$$

where

$$E[r_i^2] = \frac{1}{N_{ret}} \sum_{k=1}^{N_{ret}} \log^2\left(\frac{S_{t_k,i}}{S_{t_k-dt,i}}\right). \tag{7.4}$$

Note: More precisely, the usual factor of $1/(N_{ret} - 1)$ is used when computing sample
standard deviations; however, the return series are large (on the order of 100 to 1000 points), and
using this factor instead of $1/N_{ret}$ is immaterial for the present calculations.

Next, we make use of the known result: If a random variable $x$ is distributed as $N(\mu, \sigma)$,
with mean $\mu$ and standard deviation $\sigma$, then the variable $y = (x - \mu)/\sigma$ has standard normal
distribution $N(0, 1)$. In order to compare the actual return series on an equal footing with the
corresponding standard normal distribution, we standardize the return variables by considering
the random variable defined by:

$$\tilde{y}_i = \frac{r_{t,i} - \mu_{i,dt}}{\sigma_{i,dt}}. \tag{7.5}$$

Note that if the return series were normally distributed, then $\tilde{y}_i \sim N(0, 1)$. As observed next
using a quantile-quantile analysis, however, actual return series are generally not normally
distributed.
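
The standardization steps of equations (7.1) to (7.5) can be sketched as follows. The function name and the short price series are hypothetical; the worksheet itself operates on the supplied index data.

```python
from math import log, sqrt


def standardized_log_returns(prices):
    """Return (standardized returns, mean, standard deviation) for one price series.

    Uses the 1/N normalization of equations (7.2)-(7.4), not 1/(N - 1).
    """
    returns = [log(prices[k] / prices[k - 1]) for k in range(1, len(prices))]
    n = len(returns)
    mean = sum(returns) / n
    second_moment = sum(x * x for x in returns) / n
    std = sqrt(second_moment - mean ** 2)
    standardized = [(x - mean) / std for x in returns]
    return standardized, mean, std


# Hypothetical weekly index levels, for illustration only.
levels = [100.0, 102.0, 101.0, 105.0, 103.0, 108.0]
y, mu, sigma = standardized_log_returns(levels)
print([round(v, 4) for v in y], round(mu, 6), round(sigma, 6))
```

By construction the standardized series has sample mean zero and sample variance one under the same $1/N_{ret}$ convention.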


7.2 Quantile-Quantile Plots 


To obtain the P&L and the quantile-quantile (q-q) plot for a given index we proceed by
sampling the weekly log-return data for that index. Note that the data on the qq spreadsheet
is given in terms of the standardized log-returns; i.e., the data corresponds to the foregoing $\tilde{y}_i$.
Having sampled the data, the cumulative distribution in the variable $\tilde{y}_i$ is then estimated by
sorting and counting occurrences, $N_k$, within subintervals $(x_{k-1}, x_k)$, where we divide up the
P&L range of values into $n$ regions: $x_0 = y^{min}$, $x_1 = x_0 + dx$, ..., $x_n = x_0 + n\,dx = y^{max}$.
The quantity $dx = (y^{max} - y^{min})/n$ is the spacing over the $n$ subintervals. One can then plot
a histogram of the P&L by plotting $N_k$ against the midpoints $(x_k + x_{k-1})/2$ for all points
given by $k = 1, \ldots, n$. Figure 7.1 shows an example of two such histograms. Note that the
histograms have been created by eliminating extreme outliers. The actual (i.e., the realized)
cumulative distribution $F$ of the standardized return $\tilde{y}_i$ at points $x_k$ is then estimated by

$$\beta_k = F(x_k) \approx \frac{1}{N_{ret}} \sum_{j=1}^{k} N_j, \tag{7.6}$$

where $N_{ret}$ is the total number of dates for which data is available on a given index $i$ (i.e.,
the length of the return time series for index $i$). The percentiles $\beta_k$ are then plotted against
the percentiles $\alpha_k$, for all parameter values $k$. The latter percentiles $\alpha_k$ correspond to those
of the standard cumulative normal distribution at $x_k$, i.e., $\alpha_k = N(x = x_k)$.

FIGURE 7.1 A comparison of return histograms: (a) SPINDMV and (b) SP500C, for time series
during the period Jan. 1980 to Feb. 1999. The number of bins is set to 50. Note that the histogram
densities are normalized to give an area of 1, with the returns in percentage units.

FIGURE 7.2 A comparison of quantile-quantile plots computed using the weekly log-returns of two
indices: (a) SPINDMV and (b) SP500C, for time series during the period Jan. 1980 to Feb. 1999. The
return distribution for series (a) shows a greater deviation from normality with a thinner tail to the left
and a fatter tail to the right of the P&L.
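
The binning and the percentile pairs $(\alpha_k, \beta_k)$ described above can be sketched in Python as follows. The function name is an assumption, the sample is synthetic (drawn from a seeded standard normal generator rather than taken from the index data), and the sketch assumes the sample values are not all identical so that $dx > 0$.

```python
import random
from math import erf, sqrt


def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def qq_points(y, n_bins):
    """Return (alphas, betas, counts) on the grid x_k = y_min + k*dx, k = 1..n_bins."""
    y_min, y_max = min(y), max(y)
    dx = (y_max - y_min) / n_bins
    counts = [0] * n_bins
    for v in y:
        k = min(int((v - y_min) / dx), n_bins - 1)   # y_max falls in the last bin
        counts[k] += 1
    alphas, betas, running = [], [], 0
    for k in range(1, n_bins + 1):
        running += counts[k - 1]
        betas.append(running / len(y))               # equation (7.6)
        alphas.append(norm_cdf(y_min + k * dx))      # normal percentile at x_k
    return alphas, betas, counts


rng = random.Random(42)
sample = [rng.gauss(0.0, 1.0) for _ in range(500)]   # synthetic standardized returns
alphas, betas, counts = qq_points(sample, 50)
print(list(zip(alphas, betas))[:3])
```

Plotting `betas` against `alphas` gives the q-q curve; for normally distributed data the points lie close to the diagonal.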

The results for the quantile-quantile (q-q) plots are used to demonstrate the deviations from 
normality for the log-returns of the realized distribution. The distribution for a given particular 
time series may show a more pronounced deviation when one compares the corresponding q-q 
plot with that of another time series, as shown in Figure 7.2. Fatter or thinner tails will skew 
the otherwise-straight-line q-q plot. The MFStat numerical library is useful for computing 
cumulative and inverse cumulative normal functions.


CHAPTER 8


Project: Monte Carlo Pricer


The purpose of this project is to implement calibration and pricing of basket equity options
within a Monte Carlo simulation. The calibration combines implied volatilities with historical 
correlations. A multidimensional correlated lognormal distribution is used as the model for 
the equity returns. 

Worksheet: mc (uses parts of qq as input)

Required Libraries: MFioxl, MFBlas, MFLapack, MFFuncs, MFRangen, MFZero 


8.1 Scenario Generation 


Let us consider a group (i.e., basket) of $n$ stocks (or indices) with prices (or levels) denoted
by $S_{T,i}$, $i = 1, 2, \ldots, n$, at maturity time $T$. Given an initial price vector $S_0 = (S_{0,1}, \ldots, S_{0,n})$,
a standard method of generating correlated Brownian motion for the stock prices then follows
from (see Section 1.6):

$$S_{T,i} = S_{0,i} \exp\Big\{ (r - \tfrac{1}{2}\sigma_i^2)T + \sqrt{T} \sum_{k=1}^{n} U_{ki} x_k \Big\}. \tag{8.1}$$

Here $r$ is the risk-free interest rate and $\sigma_i$ is the volatility with respect to the $i$th stock
price. These quantities are assumed constant in equation (8.1). The set of variables $x_k$, $k =
1, 2, \ldots, n$, is made up of i.i.d. random variables drawn from the standard normal distribution
$N(0, 1)$. Matrix $U$ is used to introduce correlations among the stock prices, as shown in
detail later. Note, however, that equation (8.1) assumes time-independent volatilities. For the
time-dependent case, the foregoing can be extended by considering small time increments $dt$
and writing

$$S_{t+dt,i} = S_{t,i} \exp\Big\{ (r - \tfrac{1}{2}\sigma_i(t)^2)\,dt + \sqrt{dt} \sum_{k=1}^{n} U_{ki} x_k \Big\}. \tag{8.2}$$
k=1 


To generate the stock prices, this equation is then applied M times from any initial time, say, 
t = 0, to final time t = T = M dt (over M steps) while using the time-varying volatilities. Note 
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that equation (8.2) is, of course, also valid for time-independent volatilities. Throughout this 
project, however, we shall assume time-independent volatilities for implementation. Using 
equation (8.2), with time-independent o;, we can relate the correlations of the standardized 
log-returns to the U matrix: 


log(Si4.ar,i/ Sri) = (r ~ 50; dt 
vdt 


where superscript / is the matrix transpose and the vector x has components x,. The y, 
components are closely related to the standardized log-returns y,, as defined within the 
quantile-quantile project. Time series for these quantities can therefore be obtained from the 
qq spreadsheet. The y; variables have correlation matrix elements 





= (U'x); = y;, (8.3) 





C 
i = Corr (y; y;) = Le 8.4 
Pij (9) CC, (8.4) 
in terms of the covariance matrix elements C;; = E[y;y;], with E[] being an expectation over 
the underlying probability distribution. The covariance matrix of standardized log-returns 
is then: 


Cov(y;, y;) = Ely,y;] = X Ui Uj EL x, x/] 
kl=1 


= } , Uu Ux; = (UV); = Cj, (8.5) 


k=1 


since $E[x_k x_l] = \delta_{kl}$ [i.e., the $x_k$ are independent standard normals, $x_k \sim N(0, 1)$]. This shows
that the U matrix used to generate correlated stock price movements is obtained from the
Cholesky factorization of the covariance matrix. One also observes that uncorrelated stock
price motion follows readily in the case of $C_{ij} = \delta_{ij}\sigma_i^2 = \delta_{ij}C_{ii}$, i.e., $U_{ij} = \delta_{ij}\sigma_i = \delta_{ij}\sqrt{C_{ii}}$.
We also have the useful result that


$$\mathrm{Cov}\Big(\frac{y_i}{\sigma_i}, \frac{y_j}{\sigma_j}\Big) = \frac{\mathrm{Cov}(y_i, y_j)}{\sigma_i\,\sigma_j} = \rho_{ij}. \qquad (8.6)$$
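The relation between U and the covariance matrix can be checked numerically. The following is a minimal sketch in Python, not the MFBlas-based spreadsheet implementation: the helper names, the two-asset volatilities, and the correlation value are illustrative assumptions. It builds an upper-triangular U with C equal to the transpose of U times U, draws correlated vectors y from i.i.d. normals, and estimates the correlation as in equation (8.6).

```python
import math
import random

def cholesky_upper(C):
    """Return upper-triangular U with C = U^T U (C symmetric positive definite)."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(C[i][i] - s)
            else:
                L[i][j] = (C[i][j] - s) / L[j][j]
    # U is the transpose of the lower-triangular factor L
    return [[L[j][i] for j in range(n)] for i in range(n)]

def correlated_draw(U, rng):
    """One draw of y = U^T x, with x a vector of i.i.d. N(0, 1) deviates."""
    n = len(U)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(U[k][i] * x[k] for k in range(n)) for i in range(n)]

# Illustrative inputs (assumptions, not taken from the text)
sigma = [0.20, 0.30]
rho = 0.5
C = [[sigma[0] ** 2, rho * sigma[0] * sigma[1]],
     [rho * sigma[0] * sigma[1], sigma[1] ** 2]]
U = cholesky_upper(C)

rng = random.Random(0)
draws = [correlated_draw(U, rng) for _ in range(50000)]
cov01 = sum(y[0] * y[1] for y in draws) / len(draws)
rho_hat = cov01 / (sigma[0] * sigma[1])   # sample version of equation (8.6)
print(rho_hat)
```

The sample value `rho_hat` should approach the input correlation as the number of draws grows.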


8.2 Calibration 


In the mc spreadsheet application, the first phase is to calibrate the scenario-generation engine
to be used later for Monte Carlo pricing. This is accomplished by considering a basket of
options with known market prices on plain-vanilla calls. The second phase, discussed in the
next section, is to price the basket option of choice by running a Monte Carlo simulation
based on the calibrated volatilities as input. The spreadsheet table for the calibration basket,
duplicated in Figure 8.1, shows that for each ith stock (or index), we have a market
plain-vanilla call option price $C_i$ on a single underlying equity i with fixed spot $S_{0,i} = \$100$ (for
example), present calendar time t (e.g., today's date), given maturity $T_i$, and strike $K_i$. From
this we extract an implied volatility $\sigma_i^I$ for each underlying index i independently. This is
done by inverting the Black-Scholes formula for a call with $\sigma_i = \sigma_i^I$,


$$\text{Market Call Price}_i = C(S_{0,i}, K_i, r, \sigma_i^I, T_i - t), \qquad (8.7)$$

Strike K_i    Maturity T_i    Call price C_i    Implied volatility
 99.22        22-Jan-2000     5.18              12.850%
 99.59        12-Feb-2000     5.44              13.003%
105.67         5-Feb-2000     2.85              15.075%
106.64        14-Jan-2000     1.74              13.508%
101.54         9-May-2000     6.28              13.783%
 95.48        12-Feb-2000     7.84               9.793%
104.72        15-Jan-2000     2.03              11.927%
 96.04        31-Jan-2000     7.38              11.443%
100.14        14-May-2000     7.98              16.362%
101.15         6-May-2000     5.70              15.578%

















FIGURE 8.1 Calibration basket for 10 indices. All implied volatilities are computed with interest rate 
r = 7%, spot 100, and present date 1-Sept-1999. 


for each call contract in the calibration basket. Once the $\sigma_i^I$ are obtained, the covariance
matrix of log-returns for the total number n of underlyings is estimated using the historical
returns in the qq spreadsheet. That is, we estimate the correlation from equation (8.6) using
the average


$$\rho_{ij} \approx \frac{1}{N_{ret}} \sum_{k=1}^{N_{ret}} \frac{y_i^{(k)}\,y_j^{(k)}}{\sigma_i\,\sigma_j} \qquad (8.8)$$


over the total number of historical returns $N_{ret}$ contained in the time series table of the qq
spreadsheet. Note that superscript (k) denotes the standardized return at time $t_k$, and the $\sigma_i$
are given by equation (7.5), where $t = t_k$. Equation (8.8) gives the correlation matrix. Note
that volatility varies as $1/\sqrt{\text{time}}$, whereas covariance matrix elements vary as the square of
volatility (i.e., as 1/time). Equation (8.8) is very useful as it stands, since the matrix elements
are dimensionless and hence do not depend on the time scale of the returns (i.e., these can be
daily, weekly, yearly, etc.).
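The inversion of the Black-Scholes formula in equation (8.7) can be sketched as follows. This is an illustration rather than the spreadsheet's own routine (which would use the MFZero root finder): the function names are hypothetical, plain bisection is used in place of a library solver, and an actual/365 day count is assumed for the time to maturity. The inputs are those of the first row of Figure 8.1.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S0, K, r, sigma, tau):
    """Black-Scholes price of a European call with time to maturity tau (years)."""
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S0 * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def implied_vol(price, S0, K, r, tau, lo=1e-4, hi=5.0, tol=1e-10):
    """Solve bs_call(S0, K, r, sigma, tau) = price for sigma by bisection.

    Relies on the call price being increasing in sigma.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S0, K, r, mid, tau) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# First row of Figure 8.1; the actual/365 day count is an assumption.
tau = 143 / 365.0            # 1-Sept-1999 to 22-Jan-2000
iv = implied_vol(5.18, 100.0, 99.22, 0.07, tau)
print(iv)
```

Under these assumptions the result should come out close to the implied volatility quoted in the first row of the table.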

The calibrated covariance matrix is then obtained by using the correlation matrix in 
conjunction with the yearly implied volatilities in equation (8.7). The covariance matrix that 
is actually used for the Monte Carlo sampling, and hence used for pricing as discussed in the 
next section, is given by 

$$C_{ij} = \rho_{ij}\,\sigma_i^I\,\sigma_j^I. \qquad (8.9)$$
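The two calibration steps, a historical correlation estimate followed by rescaling with implied volatilities as in equation (8.9), can be sketched as below. The function names and the return series are made up for illustration. As a simplifying assumption, each series is standardized by its own root-mean-square value, which stands in for the volatility estimate of equation (7.5).

```python
import math

def correlation_matrix(returns):
    """Sample correlation of standardized returns.

    returns[k][i] is the return of asset i at time t_k, assumed to have
    zero mean; each series is scaled by its root-mean-square value.
    """
    N, n = len(returns), len(returns[0])
    rms = [math.sqrt(sum(row[i] ** 2 for row in returns) / N) for i in range(n)]
    return [[sum(row[i] * row[j] for row in returns) / (N * rms[i] * rms[j])
             for j in range(n)] for i in range(n)]

def calibrated_covariance(rho, implied_vols):
    """Covariance matrix built from correlations and implied volatilities."""
    n = len(implied_vols)
    return [[rho[i][j] * implied_vols[i] * implied_vols[j] for j in range(n)]
            for i in range(n)]

# Made-up returns for two assets (illustration only)
returns = [[0.010, 0.020], [-0.020, -0.010], [0.015, 0.005], [-0.005, -0.015]]
rho = correlation_matrix(returns)
C = calibrated_covariance(rho, [0.1285, 0.1300])
print(rho[0][1], C)
```

Because the correlation estimate is dimensionless, the time scale of `C` is set entirely by the unit of the implied volatilities passed in.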
Note that from use of equation (8.1) the time scale of the covariance matrix is automatically
set by the unit used for the implied volatilities, i.e., yearly.

The price $V(S_0, t = 0)$ of a basket option at present time t = 0 and present stock price vector
$S_0 = (S_{0,1}, \ldots, S_{0,n})$ with maturity t = T can be expressed as a closed-form n-dimensional
integral. In particular, the transition probability density function for an initial stock vector
$S_0$ to attain value $S_T = (S_{T,1}, \ldots, S_{T,n})$ in time T is given by an n-dimensional correlated
lognormal distribution [i.e., equation (1.198)]:


$$p(S_T, S_0; T) = (2\pi T)^{-n/2} (\det C)^{-1/2} \exp\big(-\tfrac{1}{2}\, z \cdot C^{-1} \cdot z\big), \qquad (8.10)$$

where the n-dimensional vector z is defined by the components


$$z_i = \frac{\log(S_{T,i}/S_{0,i}) - \big(r - \tfrac{1}{2}\sigma_i^2\big)T}{\sqrt{T}} \qquad (8.11)$$

and $\sigma_i = \sigma_i^I$ (the implied volatilities). Note that these components are essentially the $y_i$ defined
in equation (8.3), with dt replaced by T. The covariance matrix C is given in terms of the
correlation matrix and the implied volatilities via equation (8.9). Risk-neutral pricing then
gives [i.e., equation (1.187)]


$$V(S_0, 0) = e^{-rT} \int p(S_T, S_0; T)\,\Pi(S_T)\,dS_T, \qquad (8.12)$$


where $\Pi$ is the payoff function.
For a Monte Carlo implementation it is useful to rewrite the n-dimensional integral in
equation (8.12) using a change of variables defined by $z = U^{\dagger}x$, i.e.,


$$z_i = \sum_{k=1}^{n} U_{ki}\,x_k, \qquad (8.13)$$


where matrix U is obtained from the (upper) Cholesky factorization of the covariance matrix
with elements given in equation (8.9): $C = U^{\dagger}U$. The Jacobian of the transformation $S_T \to x$
is $T^{n/2}\sqrt{\det C}$, while for the inner product we have $z \cdot C^{-1} \cdot z = x \cdot x$. Note that the inverse
transformation $S_T = S_T(x)$ is given by equation (8.1). Combining these results with the
integrand in equation (8.12) gives the pricing formula as a discounted expectation over the
uncorrelated n-dimensional standard normal distribution:


$$V(S_0, 0) = \frac{e^{-rT}}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} e^{-\frac{1}{2}|x|^2}\,\Pi(S_T(x))\,dx \approx \frac{e^{-rT}}{N_s} \sum_{i=1}^{N_s} \Pi\big(S_T(x^{(i)})\big). \qquad (8.14)$$


This sum gives the Monte Carlo average of the pay-off evaluated at each ith scenario vector
$S_T(x^{(i)})$, i.e., the stock price vector with components given by equation (8.1), where the
$x^{(i)}$ are n i.i.d. standard normal deviates for all $N_s$ scenarios. The MFRangen numerical
library is useful for generating the standard normal deviates, while MFBlas can be used for
matrix-vector multiplication in the scenario generation.

Within the mc spreadsheet we consider the pricing of three types of basket options, as
entered within the user interface. These have the respective pay-offs


(i) Simple chooser: $\Pi(S_T) = \max\{S_{T,i} : i = 1, \ldots, n\}$.
(ii) Chooser call: $\Pi(S_T) = \max\{C_i = \max(S_{T,i} - K, 0) : i = 1, \ldots, n\}$, corresponding to
the choice of one underlying that gives the maximum call pay-off.
(iii) Chooser put: $\Pi(S_T) = \max\{P_i = \max(K - S_{T,i}, 0) : i = 1, \ldots, n\}$, corresponding to the
choice of maximum put pay-off. Note: The strike K is also a user input.
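The Monte Carlo average in equation (8.14), combined with the three pay-offs above, can be sketched in a few lines. This is a hedged illustration and not the spreadsheet code: it uses Python's standard random generator instead of MFRangen, all function names are hypothetical, and each scenario is generated in a single step of size T, which is the form equation (8.2) takes for time-independent volatilities with dt = T.

```python
import math
import random

def cholesky_upper(C):
    """Upper-triangular U with C = U^T U."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(C[i][i] - s) if i == j else (C[i][j] - s) / L[j][j]
    return [[L[j][i] for j in range(n)] for i in range(n)]

def simulate_terminal(S0, sigma, r, T, U, rng):
    """One scenario vector S_T(x) from n i.i.d. standard normal deviates x."""
    n = len(S0)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z = [sum(U[k][i] * x[k] for k in range(n)) for i in range(n)]
    return [S0[i] * math.exp((r - 0.5 * sigma[i] ** 2) * T + math.sqrt(T) * z[i])
            for i in range(n)]

def simple_chooser(ST, K):
    return max(ST)                      # the strike is not used here

def chooser_call(ST, K):
    return max(max(s - K, 0.0) for s in ST)

def chooser_put(ST, K):
    return max(max(K - s, 0.0) for s in ST)

def mc_price(payoff, S0, sigma, rho, r, T, K, n_scen, seed=0):
    """Discounted average pay-off over n_scen scenarios."""
    n = len(S0)
    C = [[rho[i][j] * sigma[i] * sigma[j] for j in range(n)] for i in range(n)]
    U = cholesky_upper(C)
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_scen):
        total += payoff(simulate_terminal(S0, sigma, r, T, U, rng), K)
    return math.exp(-r * T) * total / n_scen

# Illustrative two-asset basket (all inputs are assumptions)
rho = [[1.0, 0.3], [0.3, 1.0]]
price = mc_price(chooser_call, [100.0, 100.0], [0.2, 0.25], rho, 0.05, 1.0, 100.0, 20000)
print(price)
```

With a single underlying the chooser call reduces to a plain-vanilla call, which gives a convenient check against the Black-Scholes price.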


FIGURE 8.2 An example of the convergence pattern of an actual Monte Carlo simulation for the price
of a simple chooser option on a basket of 10 correlated stocks.


Figure 8.2 shows the results of a Monte Carlo simulation for pricing a simple chooser
option on a basket of 10 stocks. Fairly good convergence is obtained in the range of 5000 to
10,000 scenarios. Note that the spacing in the x-axis scale is not constant since the increments
were chosen using an exponentially increasing number of points. The user is encouraged to 
experiment with pricing various contracts that are in-the-money, at-the-money, and out-of- 
the-money for a varying number of total stocks in the basket. Whenever possible, compare 
the results of your Monte Carlo simulations with exact results, as in the special case of two 
correlated underlyings.


CHAPTER 9


Project: The Binomial Lattice Model 


The purpose of this project is to build a binomial lattice model to price both European and 
American puts and calls. We demonstrate how to parameterize the lattice in terms of a drift 
and a volatility parameter, adjust the drift to match forward prices, and adjust the lattice 
volatility in such a way as to match the price of an at-the-money European call option. Once 
calibrated, the binomial lattice is used to price European and American options. Extensions 
to Derman—Kani trees are left to the interested reader. 

Worksheet: bin 

Required Libraries: MFioxl, MFBlas, MFFuncs, MFZero, MFStat 


9.1 Building the Lattice 


A binomial lattice is a recombining two-dimensional tree with a total number of time steps
M > 1 over the time interval [0, T]. Lattice nodes parameterize stock prices and calendar
time. Dates are denoted by $t_m$, m = 0, 1, ..., M, where $t_0$ is the date at which we seek the
price and $t_m = t_0 + m\,\Delta t$, where $\Delta t = (T - t_0)/M$ is the elementary time step. At the mth
time step of size $\Delta t$, there are (m + 1) nodes labeled by an index n = 0, 1, ..., m. The stock
price at node (m, n) is given by

$$S_n^m = d^{\,m-n}\,u^n\,S_0, \qquad (9.1)$$


where u > 1 and d < 1. The value $S_0^0 = S_0$ is the spot price at the current time $t = t_0$ when the
option is valued. Figure 9.1 depicts the binomial lattice geometry. The model is characterized
by the parameters d, u, $\Delta t$ and by the risk-neutral probability p of an upward jump. An
upward move corresponds to a multiplication by u, whereas a downward move corresponds
to a multiplication by d. The parameter p is strictly between 0 and 1.

FIGURE 9.1 A binomial lattice originating at the current time $t = t_0$ with stock level $S_0^0$ to final
time $t_M = T$. At every time slice $t_{m-1}$, a grid point $S_{n-1}^{m-1}$ gives rise to two points, $S_n^m = u S_{n-1}^{m-1}$ and
$S_{n-1}^m = d S_{n-1}^{m-1}$, at a later time $t_m = t_{m-1} + \Delta t$.


According to pricing theory covered in Chapter 1, arbitrage-free prices are achieved if the
discrete stochastic process defined by the binomial lattice is risk neutral. One-period returns
on the stock must equal the return on the prevailing risk-free rate r. Assuming r constant, we
find that the condition


$$p\,u\,S + (1 - p)\,d\,S = e^{r\Delta t} S \qquad (9.2)$$
must be satisfied at all nodes $S = S_n^m$. Hence,
$$p\,u + (1 - p)\,d = e^{r\Delta t}. \qquad (9.3)$$
Let us introduce a lattice volatility parameter $\sigma$ by means of the following equation:
$$p\,u^2 + (1 - p)\,d^2 = e^{(2r + \sigma^2)\Delta t}. \qquad (9.4)$$


Proposition 9.1. In the limit as $\Delta t \to 0$, the lattice volatility converges to the continuous-time
lognormal volatility in the Black-Scholes model.


Proof. For a lognormal distribution, we have
$$S^{i+1} = S^i \exp\{(r - \tfrac{1}{2}\sigma^2)\Delta t + \sigma\sqrt{\Delta t}\,x\}, \quad x \sim N(0, 1), \qquad (9.5)$$


where $S^i$ denotes a stock price at time $t_i$ and $\Delta t = t_{i+1} - t_i$. Conditional on the stock price
being $S^i$ at time $t_i$, the following expected values at a later time $t_{i+1} = t_i + \Delta t$ obtain using
equation (9.5):


$$E[S^{i+1}] = S^i e^{r\Delta t}, \qquad (9.6)$$
$$E[(S^{i+1})^2] = (S^i)^2 e^{(2r+\sigma^2)\Delta t}. \qquad (9.7)$$

Within the binomial lattice, writing $E_b$ for the lattice expectation, we have instead:
$$E_b[S^{i+1}] = (pu + (1 - p)d)\,S^i, \qquad (9.8)$$
$$E_b[(S^{i+1})^2] = (pu^2 + (1 - p)d^2)\,(S^i)^2. \qquad (9.9)$$


Equating the variances $E[(S^{i+1})^2] - (E[S^{i+1}])^2$ and $E_b[(S^{i+1})^2] - (E_b[S^{i+1}])^2$, and using
equation (9.3), gives equation (9.4). □

Equations (9.3) and (9.4) allow one to parameterize the three unknowns d, u, p in the
binomial lattice by means of the risk-free rate r, the lattice volatility $\sigma$, and a third degree
of freedom. To resolve the indeterminacy we are at liberty to choose another constraining
equation. Two choices are popular:


$$p = \frac{1}{2} \qquad (9.10)$$
and
$$u = \frac{1}{d}. \qquad (9.11)$$


For the case $p = \tfrac{1}{2}$, the lattice parameters can be expressed as follows in terms of a lattice
volatility $\sigma$ and drift r:


$$d = e^{r\Delta t}\big(1 - \sqrt{e^{\sigma^2\Delta t} - 1}\big), \qquad (9.12)$$
$$u = e^{r\Delta t}\big(1 + \sqrt{e^{\sigma^2\Delta t} - 1}\big), \qquad (9.13)$$
$$p = \frac{1}{2}. \qquad (9.14)$$


This is a recombining binomial tree that drifts upward in the stock price direction.
If $u = 1/d$, the tree is symmetric about the line $S = S_0$ with zero drift and the lattice
parameters are given as follows:


$$d = a - \sqrt{a^2 - 1}, \qquad (9.15)$$
$$u = 1/d, \qquad (9.16)$$
$$p = (e^{r\Delta t} - d)/(u - d), \qquad (9.17)$$

where
$$a = \tfrac{1}{2}\big(e^{-r\Delta t} + e^{(r+\sigma^2)\Delta t}\big). \qquad (9.18)$$
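Both parameterizations can be verified directly against the moment conditions (9.3) and (9.4). The sketch below is illustrative: the function names and the sample values of r, the lattice volatility, and the time step are assumptions.

```python
import math

def lattice_p_half(r, sigma, dt):
    """Equations (9.12)-(9.14): the choice p = 1/2. Returns (d, u, p)."""
    g = math.sqrt(math.exp(sigma ** 2 * dt) - 1.0)
    growth = math.exp(r * dt)
    return growth * (1.0 - g), growth * (1.0 + g), 0.5

def lattice_symmetric(r, sigma, dt):
    """Equations (9.15)-(9.18): the choice u = 1/d. Returns (d, u, p)."""
    a = 0.5 * (math.exp(-r * dt) + math.exp((r + sigma ** 2) * dt))
    d = a - math.sqrt(a * a - 1.0)
    u = 1.0 / d
    p = (math.exp(r * dt) - d) / (u - d)
    return d, u, p

def moment_errors(d, u, p, r, sigma, dt):
    """Residuals of equations (9.3) and (9.4); both should vanish."""
    e1 = p * u + (1 - p) * d - math.exp(r * dt)
    e2 = p * u ** 2 + (1 - p) * d ** 2 - math.exp((2 * r + sigma ** 2) * dt)
    return e1, e2

r, sigma, dt = 0.05, 0.20, 1.0 / 252      # illustrative values
for build in (lattice_p_half, lattice_symmetric):
    d, u, p = build(r, sigma, dt)
    print(build.__name__, d, u, p, moment_errors(d, u, p, r, sigma, dt))
```

Both residuals should be zero up to floating-point rounding for either choice of the third constraint.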


9.2 Lattice Calibration and Pricing 


Prices of European-style options are computed iteratively, starting from the maturity date T,
where the payoff function $\phi(S)$ is ascribed to the terminal nodes $S_n^M$, n = 0, 1, ..., M. Let
$f_n^m = V(S_n^m, t_m)$ be the option price at the node $S_n^m$. For a call option with strike K, the final
time condition is


$$f_n^M = \phi(S_n^M) = \max(S_n^M - K, 0). \qquad (9.19)$$
For the put struck at the same level, the condition is instead
$$f_n^M = \phi(S_n^M) = \max(K - S_n^M, 0). \qquad (9.20)$$


The risk-neutral condition on the option price process applied to each node yields the following
recurrence relation (i.e., valuation formula):


$$f_n^m = e^{-r\Delta t}\big(p\,f_{n+1}^{m+1} + (1 - p)\,f_n^{m+1}\big). \qquad (9.21)$$


The price of the option at current time $t_0$ and spot $S_0$ is given by the last iterate, $V(S_0, t_0) = f_0^0$.

FIGURE 9.2 A schematic of upper and lower bands of option prices (i.e., the outer node values
$f_0^m$, $f_m^m$, m = 0, ..., M) for two different lattice geometries corresponding to lattice volatilities $\sigma_1$
(dashed lines) and $\sigma_2$ (solid lines). The lower lattice volatility value $\sigma_1$ gives a lower estimate of the
reference market option price, while the higher value $\sigma_2$ gives an upper estimate of the market option
price. The lattice volatility $\sigma$ (for given time step $\Delta t$ and interest rate r) that prices the market option
value exactly lies in the interval $\sigma_1 < \sigma < \sigma_2$.


To price American options, the method is similar, except an adjustment is made to account 
for the possibility of early exercise. Namely, the risk-neutral valuation formula is now (see 
Section 1.14.1 on dynamic programming): 


$$f_n^m = \max\Big(\phi(S_n^m),\; e^{-r\Delta t}\big(p\,f_{n+1}^{m+1} + (1 - p)\,f_n^{m+1}\big)\Big). \qquad (9.22)$$
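The backward recursion of equations (9.21) and (9.22) can be sketched as follows. This is a minimal illustration on the p = 1/2 lattice of equations (9.12)-(9.14); the function name, its signature, and the sample inputs are assumptions, and the spreadsheet itself relies on the MF libraries rather than on this code.

```python
import math

def binomial_price(S0, K, r, sigma, T, M, kind="call", american=False):
    """Backward induction on the p = 1/2 binomial lattice."""
    dt = T / M
    g = math.sqrt(math.exp(sigma ** 2 * dt) - 1.0)
    d = math.exp(r * dt) * (1.0 - g)
    u = math.exp(r * dt) * (1.0 + g)
    p = 0.5
    disc = math.exp(-r * dt)

    def payoff(S):
        return max(S - K, 0.0) if kind == "call" else max(K - S, 0.0)

    # Terminal condition (9.19)/(9.20): node n has n up-moves, M - n down-moves
    f = [payoff(S0 * d ** (M - n) * u ** n) for n in range(M + 1)]
    for m in range(M - 1, -1, -1):
        for n in range(m + 1):
            value = disc * (p * f[n + 1] + (1.0 - p) * f[n])          # (9.21)
            if american:
                value = max(payoff(S0 * d ** (m - n) * u ** n), value)  # (9.22)
            f[n] = value
        # entries with index above m are stale and are never read again
    return f[0]

# Illustrative inputs
eu_put = binomial_price(100.0, 100.0, 0.05, 0.2, 1.0, 400, "put")
am_put = binomial_price(100.0, 100.0, 0.05, 0.2, 1.0, 400, "put", american=True)
print(eu_put, am_put)
```

The in-place update is safe because node n at step m only reads entries n and n + 1 of the later time slice, and entry n + 1 has not yet been overwritten when entry n is computed.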


In the lattice calibration step, one has to adjust the lattice volatility to match the price
of the single at-the-money option used as calibration target. Figure 9.2 shows a schematic
representation of the lattice calibration procedure. Notice that the optimal value for the lattice
volatility $\sigma$ does not necessarily coincide with the Black-Scholes implied volatility $\sigma^I$ of
the option, but it converges to this value in the limit of time steps of vanishing length. The
calibration procedure requires the use of a root-finding algorithm. The existence of a root is
guaranteed with both choices $p = \tfrac{1}{2}$ and $u = 1/d$, for in both cases the resulting families of
binomial models allow for arbitrarily large or small values of the volatility. The worksheet
bin contains an at-the-money European call as the calibration target or reference call option
contract. The option is quoted in terms of a Black-Scholes implied volatility $\sigma^I$. The market
price results from the Black-Scholes formula


$$C_{ref} = C(S_0, K_{ref}, r, \sigma^I, T_{ref} - t_0). \qquad (9.23)$$


The current time is denoted by $t_0$ and the spot is $S_0$. To determine the lattice volatility $\sigma$, one
has to find a root of the equation


$$f_0^0 = f_0^0(\sigma, r, \Delta t) = C_{ref} \qquad (9.24)$$


for a given choice of r and lattice geometry. Here we have explicitly written the dependence
of the binomial approximation to the price, i.e., $f_0^0$, in terms of the lattice parameters. The
value of $f_0^0$ is found iteratively using equation (9.21) or equation (9.22), depending on
whether the option is European or American, respectively. The final time condition is given
by equation (9.19) for calls and equation (9.20) for puts. The value for the strike is set
as $K = K_{ref}$. Having finally obtained a value for $\sigma$, the model can be used to price other
American or European options.
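The calibration step of equation (9.24) can be sketched as a one-dimensional root search. The code below is an assumption-laden illustration: bisection stands in for the MFZero root finder, the p = 1/2 lattice is used, and the function names, bracket, and inputs are hypothetical.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S0, K, r, sigma, tau):
    """Black-Scholes call price, used here to produce the market target."""
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    return S0 * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d1 - sigma * math.sqrt(tau))

def lattice_call(S0, K, r, sigma, T, M):
    """European call on the p = 1/2 lattice, equations (9.12)-(9.14), (9.19), (9.21)."""
    dt = T / M
    g = math.sqrt(math.exp(sigma ** 2 * dt) - 1.0)
    d, u = math.exp(r * dt) * (1.0 - g), math.exp(r * dt) * (1.0 + g)
    f = [max(S0 * d ** (M - n) * u ** n - K, 0.0) for n in range(M + 1)]
    for m in range(M - 1, -1, -1):
        f = [math.exp(-r * dt) * 0.5 * (f[n + 1] + f[n]) for n in range(m + 1)]
    return f[0]

def calibrate_lattice_vol(target, S0, K, r, T, M, lo=0.01, hi=1.0, n_iter=60):
    """Bisection on equation (9.24); assumes the target price is bracketed by lo and hi."""
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if lattice_call(S0, K, r, mid, T, M) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

S0, K_ref, r, T_ref, M = 100.0, 100.0, 0.05, 1.0, 50   # illustrative inputs
sigma_I = 0.20
C_ref = bs_call(S0, K_ref, r, sigma_I, T_ref)
sigma_lattice = calibrate_lattice_vol(C_ref, S0, K_ref, r, T_ref, M)
print(sigma_I, sigma_lattice)
```

The calibrated lattice volatility generally differs slightly from the implied volatility for a finite number of steps, and the gap should narrow as M increases.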


CHAPTER 10


Project: The Trinomial Lattice Model 


The main task in this project is to build a trinomial lattice model to price European and
American claims within an explicit finite-difference scheme. Both drifted and nondrifted
types of lattice geometries are considered. For the drifted lattice model, the drift is adjusted
to account for the prevailing interest rate so as to maintain risk neutrality. As with binomial
models, the model is parameterized by means of a suitably defined lattice volatility, which
is then calibrated to match the price of a given at-the-money European option. Option
prices are obtained for all single-barrier and plain-vanilla European as well as
American-style claims. Extensions to Derman-Kani (i.e., local volatility) trinomial trees are left to the
interested reader.

Worksheets: pded1, pded2

Required Libraries: MFioxl, MFBlas, MFFuncs, MFZero, MFStat


10.1 Building the Lattice


Trinomial lattices are normally based on lattices of fixed geometry and parameterized by
the nodal transition probabilities. Consider a recombining two-dimensional tree with a total
number of time steps M > 1. The nodes of the tree are placed along the time lines $t_m$,
m = 0, 1, ..., M, where the initial (e.g., present) calendar time is denoted by $t_0$. We will
denote the time to expiry by T, which defines a time step of size $\Delta t = (T - t_0)/M$ (i.e.,
$t_M = T$) for the lattice. At the mth time step, there are (2m + 1) nodes in a standard trinomial
lattice.

The nodes are chosen on a log-rectangular grid and can be generally expressed as follows:


$$S_n^m = S_0\,e^{n\Delta x + m\mu\Delta t} \qquad (10.1)$$


for n = -m, -m+1, ..., 0, ..., m-1, m. The spot (i.e., initial stock price) is $S_0 = S_0^0$. The
choice of the parameters $\mu$ and $\Delta x$ is discussed shortly. Note that by taking logarithms of
equation (10.1), $\Delta x$ gives a measure for the change in log S within a given time slice. Namely,
$\Delta_n \log S_n^m = \log S_{n+1}^m - \log S_n^m = \Delta x$ gives the node spacing for a fixed value of time. Changes
due to a possible drift can arise from the difference $\Delta_m \log S_n^m = \log S_n^{m+1} - \log S_n^m = \mu\,\Delta t$.
In the stochastic process underlying the trinomial tree model, stock prices can jump from a
node $S_n^m$ to the nodes $S_{n'}^{m+1}$ with $n' = n, n \pm 1$. There are three transition probabilities, $p_+$,
$p_0$, and $p_-$, that correspond to an upward move, middle move (i.e., no move for zero drift),
and downward move, respectively, for any trinomial tree. These risk-neutral probabilities are
subject to two constraints; the first is that of probability conservation,


$$p_+ + p_0 + p_- = 1. \qquad (10.2)$$


A trinomial tree is recombining, and the nodes span a cone within a rectangular grid
arrangement in log-stock and time space (see Figure 10.1). Notice, though, that, as we discuss
later, to price several options of different strikes at once, it is useful to extend the trinomial
lattice to cover the complete rectangular grid of (2M + 1)(M + 1) points, so at every time
step m we have (2M + 1) points $S_n^m$, where n = -M, ..., 0, ..., M.

In what follows we present three different geometric constructions of trinomial lattices.
The first two, Cases 1 and 2, assume $\mu = 0$, while the third asks for an additional constraint on
the probability amplitudes and adjusts the drift $\mu$ in such a way as to achieve risk neutrality.


10.1.1 Case 1 ($\mu = 0$)


Since $\mu = 0$, the risk-neutrality constraint $E[S_{t+\Delta t} \mid S_t = S] = e^{r\Delta t} S$ gives:


$$p_+ e^{\Delta x} + p_0 + p_- e^{-\Delta x} = e^{r\Delta t}. \qquad (10.3)$$


Probability conservation (10.2) allows one to eliminate the variable $p_0 = 1 - (p_+ + p_-)$,
and gives


$$p_+(e^{\Delta x} - 1) + p_-(e^{-\Delta x} - 1) = e^{r\Delta t} - 1. \qquad (10.4)$$


























FIGURE 10.1 A schematic of the nondrifted ($\mu = 0$) trinomial lattice originating at current time $t = t_0$
with stock level $S_0^0$. At every time slice $t_m$, a stock at level $S_n^m$ can change to $S_{n'}^{m+1}$ with $n' = n, n \pm 1$.
The drifted lattice has a similar geometry, except all nodes are shifted by an amount $\exp(\mu\Delta t)$ after
every time step.

Let us introduce a lattice volatility parameter $\sigma$ by means of the following equation,
similar to equation (9.4):

$$p_+(e^{2\Delta x} - 1) + p_-(e^{-2\Delta x} - 1) = e^{(2r+\sigma^2)\Delta t} - 1. \qquad (10.5)$$


Equations (10.4) and (10.5) are a linear system in the two unknowns $p_+$, $p_-$. By solving them
we find the transition probabilities as a function of $\sigma$ and r:


$$p_+ = \frac{(e^{-2\Delta x} - 1)(e^{r\Delta t} - 1) - (e^{-\Delta x} - 1)(e^{(2r+\sigma^2)\Delta t} - 1)}{(e^{\Delta x} - 1)(e^{-2\Delta x} - 1) - (e^{-\Delta x} - 1)(e^{2\Delta x} - 1)}, \qquad (10.6)$$
$$p_- = \frac{(e^{r\Delta t} - 1) - (e^{\Delta x} - 1)\,p_+}{e^{-\Delta x} - 1}. \qquad (10.7)$$
10.1.2 Case 2 (Another Geometry with $\mu = 0$)
An alternative definition for the lattice volatility is
$$p_+(\Delta x)^2 + p_-(-\Delta x)^2 + p_0 \cdot 0 = \sigma^2\Delta t. \qquad (10.8)$$


This is also an acceptable definition because in the continuous-time limit it also converges to
the Black-Scholes volatility. With this equation, we have


$$p_+ + p_- = \frac{\sigma^2\Delta t}{(\Delta x)^2}. \qquad (10.9)$$


Equations (10.4) and (10.9) are a linear system of two equations in the two unknowns $p_+$, $p_-$.
Solving gives


$$p_- = \frac{(e^{\Delta x} - 1)\,\sigma^2\Delta t/(\Delta x)^2 - (e^{r\Delta t} - 1)}{e^{\Delta x} - e^{-\Delta x}} \qquad (10.10)$$
and
$$p_+ = \frac{\sigma^2\Delta t}{(\Delta x)^2} - p_-. \qquad (10.11)$$


The expressions for the probabilities are in this case slightly simpler than in Case 1.

For both choices of the lattice volatility, one has to select appropriate values for $\Delta x$,
given a time step $\Delta t$, so as to obtain acceptable probabilities, i.e., $p_+ > 0$, $p_- > 0$, and
$p_+ + p_- < 1$. For the simpler Case 2, we see immediately from equation (10.9) that the
constraint $\sigma^2\Delta t/(\Delta x)^2 < 1$ must be obeyed. This is related to the usual stability constraint
that arises in the direct, or explicit, PDE method for solving the Black-Scholes equation.


10.1.3 Case 3 (Geometry with $p_+ = p_-$: Drifted Lattice)


If we ask for the symmetry condition $p_+ = p_- = p$, we need to adjust the lattice drift $\mu$ in
such a way as to satisfy the risk-neutrality condition. Equation (10.2) gives $p_0 = 1 - 2p$, which
is used to eliminate $p_0$. This is used in the risk-neutrality condition, which now includes the
overall drift


$$p_+ e^{\mu\Delta t + \Delta x} + p_0 e^{\mu\Delta t} + p_- e^{\mu\Delta t - \Delta x} = e^{r\Delta t}, \qquad (10.12)$$


and this gives
$$p\,(e^{\Delta x} + e^{-\Delta x}) + (1 - 2p) = e^{(r-\mu)\Delta t}. \qquad (10.13)$$
Taking logarithms gives the drift in terms of all other parameters:
$$\mu = r - \frac{1}{\Delta t} \log\big(2p(\cosh\Delta x - 1) + 1\big). \qquad (10.14)$$


To define the lattice volatility parameter and express p in terms of it, we set


$$(p_+ + p_-)(\Delta x)^2 = \sigma^2\Delta t, \qquad (10.15)$$
which reduces to
$$p = \frac{\sigma^2\Delta t}{2(\Delta x)^2}. \qquad (10.16)$$


As in Case 2, this equation gives the probability in terms of $\Delta x$, $\Delta t$, and $\sigma$. A possible
strategy is to choose a sensible value for the probability p, given $\Delta t$ and $\sigma$, and then to arrive
at the spacing in the logarithm of the stock price,


$$\Delta x = \sigma\sqrt{\Delta t/(2p)}. \qquad (10.17)$$


Note that the usual stability condition for the direct PDE solution of the Black-Scholes
equation requires p < 1/2. $\Delta x$ is then given by equation (10.17) for a given value of the
lattice volatility $\sigma$. The drift then follows from equation (10.14).
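The three lattice constructions can be compared side by side. The following sketch evaluates the transition probabilities of Cases 1 and 2 for a given spacing and the drift and spacing of Case 3 for a given p. Function names and numerical inputs are illustrative assumptions; each function returns the tuple (p_plus, p_minus, mu, dx).

```python
import math

def case1(r, sigma, dt, dx):
    """Nondrifted lattice, equations (10.6)-(10.7)."""
    a1, b1, c1 = math.exp(dx) - 1.0, math.exp(-dx) - 1.0, math.exp(r * dt) - 1.0
    a2, b2 = math.exp(2.0 * dx) - 1.0, math.exp(-2.0 * dx) - 1.0
    c2 = math.exp((2.0 * r + sigma ** 2) * dt) - 1.0
    p_up = (b2 * c1 - b1 * c2) / (a1 * b2 - b1 * a2)
    p_dn = (c1 - a1 * p_up) / b1
    return p_up, p_dn, 0.0, dx

def case2(r, sigma, dt, dx):
    """Nondrifted lattice, equations (10.10)-(10.11)."""
    s = sigma ** 2 * dt / dx ** 2
    p_dn = ((math.exp(dx) - 1.0) * s - (math.exp(r * dt) - 1.0)) / (math.exp(dx) - math.exp(-dx))
    return s - p_dn, p_dn, 0.0, dx

def case3(r, sigma, dt, p):
    """Drifted lattice with p_plus = p_minus = p, equations (10.14) and (10.17)."""
    dx = sigma * math.sqrt(dt / (2.0 * p))
    mu = r - math.log(2.0 * p * (math.cosh(dx) - 1.0) + 1.0) / dt
    return p, p, mu, dx

r, sigma, dt = 0.05, 0.20, 0.01          # illustrative values
dx = sigma * math.sqrt(3.0 * dt)         # makes sigma^2 dt / dx^2 equal to 1/3
for name, result in (("case 1", case1(r, sigma, dt, dx)),
                     ("case 2", case2(r, sigma, dt, dx)),
                     ("case 3", case3(r, sigma, dt, 1.0 / 3.0))):
    print(name, result)
```

For small time steps the Case 3 drift should be close to r minus half the squared volatility, the drift of log S in the Black-Scholes model.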


10.2 Pricing Procedure 


Option prices are computed iteratively, starting from the maturity date T, at which point
option prices are given by the payoff function $\phi(S)$. We denote the value of the option at
node $S_n^m$ by $f_n^m = V(S = S_n^m, t = t_m)$. The final-time condition for a call struck at K is


$$f_n^M = \phi(S_n^M) = \max(S_n^M - K, 0), \qquad (10.18)$$
and for the put struck also at K
$$f_n^M = \phi(S_n^M) = \max(K - S_n^M, 0). \qquad (10.19)$$


Hence, prices for European-style options at each node are computed recursively using the
risk-neutral valuation formula:


$$f_n^m = e^{-r\Delta t}\big(p_+ f_{n+1}^{m+1} + p_- f_{n-1}^{m+1} + (1 - (p_+ + p_-))\,f_n^{m+1}\big). \qquad (10.20)$$


FIGURE 10.2 The explicit finite difference method makes use of prices at three adjacent nodes at a
later time step, $t = t_{m+1}$, for propagating the price to a given node at the earlier time $t = t_m$.


Figure 10.2 depicts the explicit scheme inherent in equation (10.20) for propagating prices at
each time step. The nodes are placed according to equation (10.1), as explained earlier, and
the probabilities $p_+$, $p_-$ are given as described within the respective cases. The iteration is
best accomplished using a band matrix multiplication routine in the MFBlas numerical library.
This is possible since the pricing equation (10.20) can be rewritten in matrix format, whereby
the option price solution column vector denoted by $\mathbf{f}^m$ at time $t_m$ is (2M + 1)-dimensional
with components $f_{-M}^m, f_{-M+1}^m, \ldots, f_0^m, \ldots, f_{M-1}^m, f_M^m$:


$$\mathbf{f}^m = e^{-r\Delta t}\,\mathbf{T}\,\mathbf{f}^{m+1}. \qquad (10.21)$$


This is a special linear system of equations with a tri-diagonal transfer matrix T. This
(2M + 1)-dimensional matrix is given in terms of the transition probabilities for the upward
and downward moves. Namely,


$$\mathbf{T} = \begin{pmatrix}
1 - (p_+ + p_-) & p_+ & 0 & \cdots & 0 \\
p_- & 1 - (p_+ + p_-) & p_+ & \ddots & \vdots \\
0 & p_- & \ddots & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & p_+ \\
0 & \cdots & 0 & p_- & 1 - (p_+ + p_-)
\end{pmatrix}. \qquad (10.22)$$


Note that for the drifted lattice geometry $p_+ = p_-$, hence giving a symmetric banded matrix
in this particular case.

To price American options, the iteration proceeds similarly, except an adjustment is made 
at every time step to account for the possibility of early exercise. Namely, the risk-neutral 
valuation formula is now 


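$$f_n^m = \max\Big(\phi(S_n^m),\; e^{-r\Delta t}\big(p_+ f_{n+1}^{m+1} + p_- f_{n-1}^{m+1} + (1 - (p_+ + p_-))\,f_n^{m+1}\big)\Big). \qquad (10.23)$$


The price of the option at current time $t_0$ and spot $S_0$ is given by the last iterate, $V(S_0, t_0) = f_0^0$.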
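The explicit recursion for European and American prices can be sketched as follows. This illustration uses the drifted lattice of Case 3 and plain Python lists instead of the MFBlas band-matrix routine, and it works only with nodes n = -m, ..., m at the mth time slice, so that no boundary conditions on a rectangular grid are required. The function name, the default p = 1/3, and the inputs are assumptions.

```python
import math

def trinomial_price(S0, K, r, sigma, T, M, kind="put", american=False, p=1.0 / 3.0):
    """Explicit scheme on the Case 3 lattice, equations (10.20) and (10.23)."""
    dt = T / M
    dx = sigma * math.sqrt(dt / (2.0 * p))                           # (10.17)
    mu = r - math.log(2.0 * p * (math.cosh(dx) - 1.0) + 1.0) / dt    # (10.14)
    disc = math.exp(-r * dt)

    def node(m, n):
        return S0 * math.exp(n * dx + m * mu * dt)                   # (10.1)

    def payoff(S):
        return max(S - K, 0.0) if kind == "call" else max(K - S, 0.0)

    # At time slice m, list index n + m holds the value at node (m, n)
    f = [payoff(node(M, n)) for n in range(-M, M + 1)]
    for m in range(M - 1, -1, -1):
        new = []
        for n in range(-m, m + 1):
            j = n + m + 1                    # index of node (m + 1, n) in f
            value = disc * (p * f[j + 1] + p * f[j - 1] + (1.0 - 2.0 * p) * f[j])
            if american:
                value = max(payoff(node(m, n)), value)
            new.append(value)
        f = new
    return f[0]

# Illustrative inputs
eu_put = trinomial_price(100.0, 100.0, 0.05, 0.2, 1.0, 200, "put")
am_put = trinomial_price(100.0, 100.0, 0.05, 0.2, 1.0, 200, "put", american=True)
print(eu_put, am_put)
```

The European value can be compared with the Black-Scholes put price for the same inputs, and the American value should exceed it by the early-exercise premium.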

The risk-neutral condition is exactly satisfied at all nodes if we restrict ourselves to the
grid points belonging to the interior of the cone with n = -m, ..., m at the mth time slice. If
the foregoing equations are used to price options across all grid points with n = -M, ..., M,
risk neutrality fails outside the boundaries of the cone and numerical errors arise. This is the
case for implementing the strictly trinomial model, unless proper boundary conditions are
imposed on the extreme nodes of the rectangular grid.

For the case of American put options we can compute the exercise boundary as a function
of the time to maturity T - t. The exercise-boundary value S* at time t corresponds to the
highest value of S for which it is optimal to exercise the option rather than holding it. The
value S*, for each $t = t_m$, is the largest node value $S_n^m$ for which $\phi(S_n^m) > f_n^m$, with the latter
quantity given by the right-hand side of equation (10.20), i.e., the continuation value at $t_m$.


10.3 Calibration


As in the binomial model, we determine the lattice volatility in such a way as to match the
price of an at-the-money option chosen as calibration target. The resulting optimal implied
lattice volatility computed again does not coincide with the implied Black-Scholes volatility
$\sigma^I$, but it converges to this value in the limit of infinitesimal time steps. The lattice volatility
compensates for the systematic errors in the discrete-time approximation scheme inherent in
the trinomial method.

Calibration requires the use of a root-finding algorithm. The procedure is similar for all
three lattice cases, as now discussed. The pded1 spreadsheet contains a European
at-the-money call with given maturity $T_{ref}$ and strike $K_{ref}$ as the calibration (reference) target. The
price of the calibration target is provided as a Black-Scholes implied volatility $\sigma^I$. The market
price of this call is then given by the Black-Scholes formula


$$C_{ref} = C(S_0, K_{ref}, r, \sigma^I, T_{ref} - t_0). \qquad (10.24)$$


Here $t_0$ is the time at which we seek the price, and the corresponding spot price is assumed to be
$S_0$. The implied lattice volatility $\sigma$ is obtained by inverting the following equation with a root
finder in the MFZero library:


$$f_0^0 = f_0^0(\sigma, r, \Delta t, \Delta x) = C_{ref}. \qquad (10.25)$$


Here we have explicitly written the dependence of the trinomial lattice option price, i.e., $f_0^0$,
in terms of the lattice parameters. The value of $f_0^0$ is found iteratively using the earlier pricing
equations for a European call option. Note that the interest rate r is held fixed and $\Delta t$ is also
fixed by the chosen number of time steps in the lattice. The value for the strike is set as
$K = K_{ref}$, i.e., the reference strike.


10.4 Pricing Barrier Options


The procedure to price a barrier option is a modification of the method for plain-vanilla
options, except we have to account for a boundary condition at the barrier H. We discuss,
in detail, the case of single-barrier down-and-out options. The case of up-and-out options is
similar, while the case of knock-in options reduces to that of knockouts thanks to the in-out
symmetry relation


Knock-In + Knockout = Vanilla. (10.26)


To price a down-and-out, we can assume that the spot is above the lower barrier, i.e., $S_0 > H$;
otherwise the option would be worthless. There is an important distinction between the cases
$\mu = 0$ and $\mu \neq 0$. In case $\mu \neq 0$, it is not possible to adjust the lattice so that a horizontal line of
lattice nodes lies exactly on the barrier. While iterating equation (10.20), one verifies the node
positions with respect to the barrier. For all values of n for which $S_n^m \leq H$ one sets $f_n^m = 0$ in the
pricing equations. This limitation in approximating the real location of the barrier at each time
slice gives rise to systematic numerical errors (see also Section 11.4 and Figure 11.1).

FIGURE 10.3 The spacing of a nondrifted ($\mu = 0$) trinomial lattice can be chosen to exactly match an
upper- or lower-barrier level along a horizontal line of nodes.

In the case $\mu = 0$, it is possible to adjust the lattice so that a subset of the nodes lies
exactly on the barrier. Figure 10.3 shows such a description of a nondrifted lattice, where
the choice of geometry is such that a set of horizontal nodes corresponds exactly to either a
lower or upper barrier level. Namely, one can select a spacing $\Delta x$ so that for a given positive
integer $n_H$ we have $H = S_0 \exp(-n_H \Delta x)$, or, expressed otherwise:


$$\Delta x = \frac{1}{n_H} \log(S_0/H).$$


The price of the down-and-out option is then obtained by iterating the pricing equations,
whereby one considers only the nodes lying at and above the barrier, i.e., $n \geq -n_H$, with the
condition that the option prices $f_n^m = 0$, $n \leq -n_H$, for all m. Note that the approach can be
used to value European as well as American-style barrier options.
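The barrier-matching construction for the nondrifted lattice can be sketched as follows for a European down-and-out call. The code is an illustration under stated assumptions: Case 2 probabilities are used, the function name and inputs are hypothetical, and the number of time steps and the integer n_H must be chosen together so that the probabilities remain admissible.

```python
import math

def down_and_out_call(S0, K, H, r, sigma, T, M, n_H):
    """European down-and-out call on a nondrifted lattice (Case 2 probabilities)."""
    dt = T / M
    dx = math.log(S0 / H) / n_H          # barrier sits exactly on the row n = -n_H
    s = sigma ** 2 * dt / dx ** 2        # must stay below 1, see equation (10.9)
    p_dn = ((math.exp(dx) - 1.0) * s - (math.exp(r * dt) - 1.0)) / (math.exp(dx) - math.exp(-dx))
    p_up = s - p_dn
    if not (p_up > 0.0 and p_dn > 0.0 and p_up + p_dn < 1.0):
        raise ValueError("inadmissible probabilities; change M or n_H")
    disc = math.exp(-r * dt)
    # Terminal values on rows n = -M..M; list index j = n + M
    f = [max(S0 * math.exp(n * dx) - K, 0.0) if n > -n_H else 0.0
         for n in range(-M, M + 1)]
    for m in range(M - 1, -1, -1):
        new = f[:]                       # rows outside the cone are carried along unused
        for n in range(-m, m + 1):
            j = n + M
            if n <= -n_H:
                new[j] = 0.0             # knocked out at or below the barrier
            else:
                new[j] = disc * (p_up * f[j + 1] + p_dn * f[j - 1]
                                 + (1.0 - p_up - p_dn) * f[j])
        f = new
    return f[M]

# Illustrative inputs: barrier at 90, five node rows between spot and barrier
price = down_and_out_call(100.0, 100.0, 90.0, 0.05, 0.2, 1.0, 200, 5)
print(price)
```

The knock-in price then follows from the in-out symmetry relation (10.26) by subtracting this value from the plain-vanilla price computed on the same lattice.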


10.5 Put-Call Parity in Trinomial Lattices


One of the consequences of risk neutrality is the put-call parity relation for European prices
$$S + P(S, t) - C(S, t) = K e^{-r(T-t)} \qquad (10.27)$$


across all nodes $(S, t) = (S_n^m, t_m)$. It is instructive to verify it directly. Consider prices at the
final time line with m = M. Call and put prices $C_n^M$ and $P_n^M$ satisfy, by construction, the
put-call parity relation at the terminal nodes,


$$S_n^M + P_n^M - C_n^M = K. \qquad (10.28)$$


At the internal nodes, we can proceed by induction. Hence, begin by assuming that the put-call
parity relation is satisfied at the mth time step,


$$S_n^m + P_n^m - C_n^m = K e^{-r(T-t_m)}. \qquad (10.29)$$
By applying equation (10.20) for the put and call we find that:
$$P_n^{m-1} - C_n^{m-1} = e^{-r\Delta t}\big(p_+(P_{n+1}^m - C_{n+1}^m) + p_-(P_{n-1}^m - C_{n-1}^m) + (1 - (p_+ + p_-))(P_n^m - C_n^m)\big). \qquad (10.30)$$





Using probability conservation and the induction hypothesis, $P_k^m - C_k^m = K e^{-r(T-t_m)} - S_k^m$,
$k = n, n \pm 1$, equation (10.30) yields


$$P_n^{m-1} - C_n^{m-1} = e^{-r\Delta t}\Big[K e^{-r(T-t_m)} - \big(p_+ S_{n+1}^m + p_- S_{n-1}^m + (1 - (p_+ + p_-))\,S_n^m\big)\Big]. \qquad (10.31)$$


The second term on the right-hand side of this equation, once discounted, simplifies to $S_n^{m-1}$ as a
consequence of the risk-neutrality condition. Multiplying out the discount term while using
$t_{m-1} = t_m - \Delta t$ then gives


$$S_n^{m-1} + P_n^{m-1} - C_n^{m-1} = K e^{-r(T-t_{m-1})}. \qquad (10.32)$$


Put-call parity is therefore recovered.
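The induction argument can also be checked numerically at the root node. The sketch below prices a European put and call on the drifted (Case 3) lattice and evaluates the parity residual. The function name and inputs are illustrative assumptions.

```python
import math

def trinomial_european(payoff, S0, r, sigma, T, M, p=1.0 / 3.0):
    """European price on the drifted (Case 3) lattice via equation (10.20)."""
    dt = T / M
    dx = sigma * math.sqrt(dt / (2.0 * p))
    mu = r - math.log(2.0 * p * (math.cosh(dx) - 1.0) + 1.0) / dt
    disc = math.exp(-r * dt)
    # Terminal slice: index j corresponds to n = j - M
    f = [payoff(S0 * math.exp(n * dx + M * mu * dt)) for n in range(-M, M + 1)]
    for m in range(M - 1, -1, -1):
        # Node n = j - m at slice m reads indices j, j + 1, j + 2 of slice m + 1
        f = [disc * (p * f[j + 2] + p * f[j] + (1.0 - 2.0 * p) * f[j + 1])
             for j in range(2 * m + 1)]
    return f[0]

S0, K, r, sigma, T, M = 100.0, 95.0, 0.05, 0.2, 1.0, 100   # illustrative inputs
call = trinomial_european(lambda S: max(S - K, 0.0), S0, r, sigma, T, M)
put = trinomial_european(lambda S: max(K - S, 0.0), S0, r, sigma, T, M)
residual = S0 + put - call - K * math.exp(-r * T)
print(call, put, residual)
```

Because the lattice is risk neutral by construction, the residual should vanish up to rounding error for any number of time steps, even though each price individually carries a discretization error.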


10.6 Computing the Sensitivities


The sensitivities $\Delta = \partial V/\partial S$, $\Gamma = \partial^2 V/\partial S^2$ and the vega $\partial V/\partial\sigma^I$ of the option value V at
present time $t_0$ and spot $S = S_0$ can be approximated by finite differences:


$$\Delta \approx \frac{V(S_0 + dS, t_0) - V(S_0 - dS, t_0)}{2\,dS}, \qquad (10.33)$$
$$\Gamma \approx \frac{V(S_0 + dS, t_0) + V(S_0 - dS, t_0) - 2V(S_0, t_0)}{(dS)^2}, \qquad (10.34)$$
$$\frac{\partial V}{\partial\sigma^I} \approx \frac{V(\sigma^I + d\sigma, t_0) - V(\sigma^I - d\sigma, t_0)}{2\,d\sigma}. \qquad (10.35)$$


The quantity dS can be chosen as a small change in spot price, e.g., $dS \approx 0.001\,S$, and
likewise $d\sigma$ is a small increment in the volatility, $d\sigma \approx 0.001\,\sigma^I$, where $\sigma^I$ is the implied
volatility at $S_0$. Note that for clarity of notation we have explicitly written the dependence of
V on spot and volatility only, respectively.
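The finite-difference formulas (10.33)-(10.35) apply to any pricing function of spot and volatility. In the sketch below the Black-Scholes call formula stands in for the lattice price so that the example stays short; this substitution, the function names, and the inputs are assumptions made for illustration.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, tau):
    """Black-Scholes call price, used here as a stand-in for a lattice pricer."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d1 - sigma * math.sqrt(tau))

def greeks(V, S0, sigma_I, rel_step=0.001):
    """Central differences (10.33)-(10.35) for a pricing function V(S, sigma)."""
    dS, dsig = rel_step * S0, rel_step * sigma_I
    up, mid, down = V(S0 + dS, sigma_I), V(S0, sigma_I), V(S0 - dS, sigma_I)
    delta = (up - down) / (2.0 * dS)
    gamma = (up + down - 2.0 * mid) / dS ** 2
    vega = (V(S0, sigma_I + dsig) - V(S0, sigma_I - dsig)) / (2.0 * dsig)
    return delta, gamma, vega

def price(S, sig):
    return bs_call(S, 100.0, 0.05, sig, 1.0)   # illustrative contract

delta, gamma, vega = greeks(price, 100.0, 0.20)
print(delta, gamma, vega)
```

When `V` is a lattice price, each bump requires a full repricing, and the second difference is the most sensitive of the three to discretization noise in the price.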


CHAPTER 11


Project: Crank—Nicolson Option 
Pricer 


The purpose of this project is to implement an implicit finite-difference solution scheme
to price standard as well as barrier-type European and American options using the
Crank-Nicolson (CN) method. The CN method is also calibrated against a reference European option.
Possible put-call parity mismatches introduced by the CN approximation are then studied
across a whole range of values in the moneyness parameter. The unique approach makes use
of a drifted trinomial lattice. An implementation of the CN method within nondrifted lattices
as well as other extensions are left as exercises for the interested reader.

Worksheets: cranic1, cranic2

Required Libraries: MFioxl, MFStat, MFFuncs, MFBlas, MFLapack, MFZero


11.1 The Lattice for the Crank-Nicolson Pricer


Crank-Nicolson methods are among the more commonly used implicit finite-difference 
solvers for the Black-Scholes PDE. Here we implement a rather unique CN approach that 
borrows partly from the methodology used in the direct PDE trinomial lattice solver covered 
in the previous project. The first step is to build a trinomial lattice. This part was already 
covered in the first section of the previous project on trinomial lattice modeling and hence 
will not be repeated here. There we discussed the use of three types of lattices, two of which 
are driftless. In this project we will focus specifically on the drifted lattice approach. The 
use of nondrifted lattices in CN (which were seen in the direct trinomial solver to intro- 
duce explicit differences in the nodal transition probabilities for upward versus downward 
moves) will be left as a future exercise. As well, this project focuses on the calibration and 
subsequent pricing of plain European and single-barrier European options. The extension to 
price double-barrier options as well as American barrier options is also obvious within the 
framework provided here, although we shall leave this as a separate implementation exercise 
for the interested reader. 


As described in the previous trinomial project, the nodes are chosen on a log-rectangular 
grid as given by equation (10.1) with nonzero drift parameter $\mu$. For a full description of the 
lattice, see the first section of the trinomial project. Again, making use of the risk-neutrality 
condition and taking logarithms gives the drift in terms of all other parameters, as given in 
equation (10.14) and repeated here for clarity: 

$$\mu = r - \frac{1}{\Delta t} \log\bigl(2p(\cosh \Delta x - 1) + 1\bigr). \tag{11.1}$$

The probability $p$ is again given in terms of the lattice volatility parameter $\sigma$, the spacing $\Delta x$, 
and $\Delta t$: 

$$p = \frac{\sigma^2 \Delta t}{2(\Delta x)^2}. \tag{11.2}$$

As in the direct method, one chooses a sensible value for the probability $p$, given a $\Delta t$ and a 
$\sigma$, and then arrives at the spacing in the logarithm of the stock price: 

$$\Delta x = \sigma \sqrt{\Delta t / 2p}. \tag{11.3}$$

Note that $p$ is normally chosen in the range $0 < p < 1/2$, although the CN method can be shown 
to be stable and convergent for all $p > 0$. To reiterate, the $M + 1$ time slices are chosen with 
time step $\Delta t = (T - t_0)/M$, where $T$ is the maturity time and $t_0$ denotes present time. 


11.2 Pricing with Crank-Nicolson 


Here we shall explicitly discuss the pricing of European-style options; the extension to 
Americans is obvious and introduces the same extra step as discussed in the previous project. 
The pricing equations for the CN method differ significantly from the direct trinomial pricer, 
in that propagation of the solution takes into account both backward and forward motion. In 
particular, one can relate the option prices $f_n^m = V(S_n^m, t_m)$ at the nodes $S_n^m$ for time $t_m$ to 
the option prices $f_n^{m+1} = V(S_n^{m+1}, t_{m+1})$ at nodes $S_n^{m+1}$ for future time $t_{m+1} = t_m + \Delta t$, via the 
probability $p$ for forward-time upward and downward stock motion, as follows: 

$$(1 + p) f_n^m - \frac{p}{2}\bigl(f_{n+1}^m + f_{n-1}^m\bigr) = e^{-r\Delta t}\Bigl[(1 - p) f_n^{m+1} + \frac{p}{2}\bigl(f_{n+1}^{m+1} + f_{n-1}^{m+1}\bigr)\Bigr]. \tag{11.4}$$


Note the difference between this and the explicit finite-difference approach used in the 
trinomial lattice project (see Figure 10.2). In this implicit CN scheme, prices at three nodes 
at a later time step are related to prices at three nodes before a time step. Equation (11.4) can 
be rewritten in matrix format in which the option price solution column vector denoted by $f^m$ 
at time $t_m$ is $(2M + 1)$-dimensional, with components $f^m_{-M}, f^m_{-M+1}, \ldots, f^m_0, \ldots, f^m_{M-1}, f^m_M$: 

$$Z f^m = e^{-r\Delta t} A f^{m+1}. \tag{11.5}$$

This is a linear system of equations with tridiagonal matrices $A$, $Z$ given in terms of the 
transfer matrix $T$, 

$$Z = \tfrac{3}{2}\mathbf{1} - \tfrac{1}{2}T, \tag{11.6}$$

$$A = \tfrac{1}{2}\mathbf{1} + \tfrac{1}{2}T, \tag{11.7}$$

where $\mathbf{1}$ is the $(2M + 1)$-dimensional identity matrix and 

$$T = \begin{pmatrix}
1-2p & p & 0 & \cdots & \cdots & 0 \\
p & 1-2p & p & 0 & & \vdots \\
0 & p & 1-2p & p & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & \ddots & 0 \\
\vdots & & \ddots & p & 1-2p & p \\
0 & \cdots & \cdots & 0 & p & 1-2p
\end{pmatrix}. \tag{11.8}$$


To implement the CN method, equation (11.5) is solved at every time step beginning with 
the known terminal payoff vector whose components are given by 

$$f_n^M = \max(S_n^M - K, 0) \tag{11.9}$$

for the case of a call struck at $K$ and 

$$f_n^M = \max(K - S_n^M, 0) \tag{11.10}$$

for a put struck at $K$. Equation (11.5) constitutes a system of $2M + 1$ equations in the $2M + 1$ 
unknowns $f_n^m$ with band diagonal matrix $Z$ and is hence readily solved by LU factorization. 
The routine GBSV in the MFLapack library within MFlibs is useful for this purpose once 
the matrices $A$, $Z$ have been transformed to band matrix format. The latter operation is easily 
accomplished using the routines ST2B and GT2B in MFBlas. Having solved for $f^{M-1}$ by using 
the known payoff solution vector $f^M$, the procedure is then iterated by solving equation (11.5) 
for $f^{M-2}$. Iterating $M$ times in this fashion gives the option price vectors at all time slices, 
including the vector $f^0$ at present time $t_0$. 

As a final important note, we observe that the CN pricing equation (11.5) assumes that 
the lattice grid takes into account large enough values of $S_n^m$ and small enough values of $S_n^m$, 
where the put and call are negligible, respectively. Moreover, we have purposely excluded 
the proper corrections from the boundaries into the matrix pricing equations. The reader can 
experiment with the inclusion of boundary conditions at the lower and upper extremities of 
the rectangular grid. Without such inclusions the present CN approach will fail to correctly 
price options at nodes outside of the proper trinomial lattice (i.e., for nodes lying above or 
below the outer cone of the tree). 
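
The pricing steps of this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the worksheet implementation: we assume the node layout $S_n^m = S_0 \exp(\mu m \Delta t + n \Delta x)$ with $n = -M, \ldots, M$ and $t_0 = 0$, we replace the banded LU routine GBSV with a hand-written Thomas (tridiagonal) solve, and, as in the text, no boundary corrections are applied. The names `solve_tridiagonal` and `cn_european` are our own.

```python
import math


def bs_call(S, K, r, sigma, tau):
    """Black-Scholes call price, used only as a reference value."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return S * N(d1) - K * math.exp(-r * tau) * N(d2)


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system with constant bands."""
    n = len(rhs)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper / diag, rhs[0] / diag
    for i in range(1, n):
        denom = diag - lower * c[i - 1]
        c[i] = upper / denom
        d[i] = (rhs[i] - lower * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def cn_european(S0, K, r, sigma, T, M, p=0.25, is_call=True):
    """European option value at spot S0 from equations (11.1)-(11.10)."""
    dt = T / M
    dx = sigma * math.sqrt(dt / (2.0 * p))                           # (11.3)
    mu = r - math.log(2.0 * p * (math.cosh(dx) - 1.0) + 1.0) / dt    # (11.1)
    disc = math.exp(-r * dt)
    size = 2 * M + 1
    # Terminal nodes (assumed layout) and payoff vector f^M, (11.9)/(11.10).
    nodes = [S0 * math.exp(mu * T + n * dx) for n in range(-M, M + 1)]
    f = [max(s - K, 0.0) if is_call else max(K - s, 0.0) for s in nodes]
    for _ in range(M):
        # Right-hand side e^{-r dt} A f^{m+1}; A has 1-p on and p/2 off the diagonal.
        rhs = []
        for n in range(size):
            left = f[n - 1] if n > 0 else 0.0
            right = f[n + 1] if n < size - 1 else 0.0
            rhs.append(disc * ((1.0 - p) * f[n] + 0.5 * p * (left + right)))
        # Solve Z f^m = rhs; Z has 1+p on and -p/2 off the diagonal.
        f = solve_tridiagonal(-0.5 * p, 1.0 + p, -0.5 * p, rhs)
    return f[M]  # component n = 0 of f^0, i.e., the value at spot S0


call = cn_european(100.0, 100.0, 0.05, 0.2, 1.0, M=200)
put = cn_european(100.0, 100.0, 0.05, 0.2, 1.0, M=200, is_call=False)
print(call, bs_call(100.0, 100.0, 0.05, 0.2, 1.0), call - put)
```

Printing the call next to its Black-Scholes value and the call-put difference gives a quick check of the discretization error and of put-call parity at the central node, where the missing boundary corrections have negligible influence.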


Calibration 


As in the direct trinomial model, the lattice volatility is determined in such a way as to match 
the price of an at-the-money European option chosen as a calibration target. The resulting 
optimal implied lattice volatility again does not coincide with the implied Black-Scholes 
volatility $\sigma^I$, but it converges to this value in the limit of infinitesimal time steps. As 
in the other lattice methods, the lattice volatility compensates partly for the systematic errors 
in the discrete-time approximation scheme inherent in the trinomial method. 

Calibration requires the use of a root-finding algorithm. The cranic1 spreadsheet contains 
a European at-the-money call with given maturity $T_{ref}$ and strike $K_{ref}$ as the calibration target 
or reference. The price of the calibration target is provided as a Black-Scholes implied 
volatility $\sigma^I$. The market price of this call is then given by the Black-Scholes formula: 

$$C_{ref} = C(S_0, K_{ref}, r, \sigma^I, T_{ref} - t_0). \tag{11.11}$$

$t_0$ is the time at which we seek the price, and the corresponding spot price is assumed to be 
$S_0$. The implied lattice volatility $\sigma$ is obtained by inverting the following equation with a root 
finder in the MFZero library: 

$$f_0^0 = f_0^0(\sigma, r, \Delta t, \Delta x) = C_{ref}. \tag{11.12}$$

Here we have explicitly written the dependence of the CN approximation of the price, i.e., 
$f_0^0$, on the lattice parameters. The value of $f_0^0$ is found iteratively using the earlier 
pricing equations for a European call option. Note that the interest rate $r$ is held fixed and 
that $\Delta t$ is also fixed by the chosen number of time steps in the lattice. The value for the strike 
is set as $K = K_{ref}$. 
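
The inversion in equation (11.12) can be sketched with a plain bisection, shown below. This is only an illustration: bisection is swapped in for the MFZero root finder, and to keep the example short the CN pricer is replaced by a stand-in, namely a Black-Scholes price evaluated at a deliberately biased volatility (the factor 0.98 is an arbitrary assumption that mimics a systematic discretization error). The name `calibrate_lattice_vol` is our own.

```python
import math


def bs_call(S, K, r, sigma, tau):
    """Black-Scholes call price."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    N = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return S * N(d1) - K * math.exp(-r * tau) * N(d2)


def calibrate_lattice_vol(pricer, target, lo=1e-4, hi=2.0, tol=1e-10, max_iter=200):
    """Bisection for sigma such that pricer(sigma) = target."""
    f_lo = pricer(lo) - target
    f_hi = pricer(hi) - target
    if f_lo * f_hi > 0.0:
        raise ValueError("target price is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = pricer(mid) - target
        if f_lo * f_mid <= 0.0:
            hi = mid                 # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # root lies in [mid, hi]
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Calibration target (11.11) for hypothetical inputs.
S0, K_ref, r, sigma_imp, T_ref = 100.0, 100.0, 0.05, 0.2, 1.0
C_ref = bs_call(S0, K_ref, r, sigma_imp, T_ref)

# Stand-in for the lattice price f_0^0(sigma, ...): an assumption, not the CN pricer.
lattice_price = lambda sigma: bs_call(S0, K_ref, r, 0.98 * sigma, T_ref)

sigma_lattice = calibrate_lattice_vol(lattice_price, C_ref)
print(sigma_lattice)
```

In an actual calibration, `lattice_price` would return the CN value at the central node for the given lattice volatility, with $\Delta x$ recomputed from equation (11.3) at each trial value of $\sigma$.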


Pricing Barrier Options 


The procedure for pricing barrier options is almost identical to what is formulated in 
Section 10.4 of the previous project. One important distinction arises, however, when using 
a drifted lattice (as is the case in the current CN approach) versus a nondrifted lattice. The 
differences that arise between the use of drifted and nondrifted lattices were also briefly 
discussed in the previous project, where the nondrifted lattice was favored over the use of 
drifted lattices when pricing options with a constant barrier level. Within the CN drifted-lattice 
approach, the price of an up-and-out barrier call, for example, with barrier at $S = H$, requires 
one to employ the pricing procedure as given in Section 11.2. At each time $t_m$, however, the 
price components $f_n^m$ must be reset to zero for all $n \ge n_H$ (i.e., all nodes at and above the 
barrier level $H$) before the next propagation time step. The integer $n_H$ can be taken to be 
the least integer value of $n$ such that $S_n^m \ge H$. Figure 11.1 demonstrates a possible source of 
inaccuracy arising from the use of a drifted-lattice geometry when pricing a barrier option, 
with barrier level at a fixed height. The zero-boundary conditions imposed on the "boundary" 
nodes create only a coarse approximation to the actual horizontal straight-line barrier. Note 
that as one makes the time step smaller and smaller, this approximation becomes more and 
more accurate. In the limit $\Delta t \to 0$, this approximation becomes exact. 

FIGURE 11.1 A drifted trinomial lattice used to price a barrier option. The barrier level (upper barrier) lies along a 
horizontal line, which is inaccurately approximated by the zero-boundary nodes (approximated barrier). 
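
The resetting step for an up-and-out option can be written as a small helper applied to the price vector at each time slice. The sketch below is our own illustration (the name `reset_up_and_out` is hypothetical); it assumes the node values at the current time slice are supplied in increasing order of $n$.

```python
def reset_up_and_out(f, nodes, H):
    """Zero the prices at all nodes at or above the barrier H.

    f     : option prices f_n^m at the current time slice
    nodes : stock prices S_n^m at the same slice, increasing in n
    Returns the reset price vector and the position n_H of the first
    node with S_n^m >= H (None if the barrier lies above the grid).
    """
    n_H = next((n for n, s in enumerate(nodes) if s >= H), None)
    if n_H is None:
        return list(f), None
    return [v if n < n_H else 0.0 for n, v in enumerate(f)], n_H


# Small hypothetical slice: the barrier at 105 falls between two nodes.
prices, n_H = reset_up_and_out([1.0, 2.0, 3.0, 4.0, 5.0],
                               [80.0, 90.0, 100.0, 110.0, 120.0], H=105.0)
print(prices, n_H)
```

Because the nodes of a drifted lattice move from one time slice to the next, $n_H$ generally changes with $m$, which is the source of the staircase approximation to the barrier described above.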

The pricing of down-and-out options is similar, while the pricing of knock-in options 
reduces to that of knockouts, thanks to the in-out symmetry relation of equation (10.26). The 
reader may note that the spreadsheet can also be readily extended to include the pricing of 
American barrier options, if desired. 


CHAPTER 12 


Project: Static Hedging of Barrier Options 


The objective of this study is to hedge European barrier options by means of a static replication 
strategy involving a market-restricted set of available plain-vanilla European call and put 
options. The hedge trade occurs at the initial time and is unwound either at maturity or when 
the barrier is crossed. 

Worksheets: bhedge 

Required Libraries: MFioxl, MFBlas, MFFuncs, MFLapack, MFStat, MFCollection 


12.1 Analytical Pricing Formulas for Barrier Options 


We consider four flavors of single-barrier options: (1) down-and-out, (2) up-and-out, 
(3) down-and-in, (4) up-and-in. Each option can be either a call or a put, for a total of eight different 
types of contracts. 


12.1.1 Exact Formulas for Barrier Calls for the Case H < K 
Let us recall from Section 3.3 the pricing formulas for barrier options in the geometric 
Brownian motion model. The European down-and-out call option has nonzero value only for 
$S > H$: 

$$C^{DO}(S, K, T - t) = C(S, K, T - t) - (S/H)^{-(k-1)}\, C(H^2/S, K, T - t), \tag{12.1}$$

with $k = r/(\tfrac{1}{2}\sigma^2)$. This shows that the barrier option value at spot $S > H$ can be expressed 
in terms of the plain-vanilla call evaluated at effective spot values of $S$ and $H^2/S$. The 
corresponding down-and-in call option value is then 

$$C^{DI}(S, K, T - t) = C(S, K, T - t) - C^{DO}(S, K, T - t). \tag{12.2}$$

The formula for the value of the call $C(S, K, T - t)$ is given by the plain Black-Scholes 
formula. Using it gives [i.e., equation (3.52)]: 

$$\begin{aligned}
C^{DO}(S, K, \tau) = {} & S N(d_1(S/K)) - K e^{-r\tau} N(d_2(S/K)) \\
& - S (H/S)^{k+1} N(d_1(H^2/SK)) \\
& + K e^{-r\tau} (H/S)^{k-1} N(d_2(H^2/SK)),
\end{aligned} \tag{12.3}$$

where $d_1(x)$ and $d_2(x)$ are defined as 

$$d_1(x) = \frac{\log x + (r + \tfrac{1}{2}\sigma^2)\tau}{\sigma\sqrt{\tau}} = \frac{\log x}{\sigma\sqrt{\tau}} + \tfrac{1}{2}(k + 1)\sigma\sqrt{\tau}, \tag{12.4}$$

$$d_2(x) = d_1(x) - \sigma\sqrt{\tau}. \tag{12.5}$$

Note that we have reexpressed the formulas in terms of the time to maturity $\tau = T - t$. As 
well, for clarity the obvious dependence on $k$ and $\sigma\sqrt{\tau}$ within the functions $d_1$ and $d_2$ is not 
written explicitly. Note that the down-and-in call option $C^{DI}(S, K, \tau)$ expressed in terms of 
cumulative normal distribution functions is just the negative of the sum of the last two terms 
in equation (12.3). 
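
As a numerical check on the formulas above, the following Python sketch evaluates the down-and-out call for $H < K$ in two ways: through the reflection form (12.1) and through the explicit form (12.3). The function names and the sample inputs are our own assumptions; the two values should agree to rounding error, and both vanish at $S = H$.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d1(x, r, sigma, tau):
    """Equation (12.4)."""
    return (math.log(x) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))


def d2(x, r, sigma, tau):
    """Equation (12.5)."""
    return d1(x, r, sigma, tau) - sigma * math.sqrt(tau)


def bs_call(S, K, r, sigma, tau):
    """Plain-vanilla Black-Scholes call C(S, K, tau)."""
    return (S * norm_cdf(d1(S / K, r, sigma, tau))
            - K * math.exp(-r * tau) * norm_cdf(d2(S / K, r, sigma, tau)))


def down_and_out_call_reflection(S, H, K, r, sigma, tau):
    """Equation (12.1), valid for H < K and S >= H."""
    k = 2.0 * r / sigma ** 2
    return (bs_call(S, K, r, sigma, tau)
            - (S / H) ** (-(k - 1.0)) * bs_call(H * H / S, K, r, sigma, tau))


def down_and_out_call_explicit(S, H, K, r, sigma, tau):
    """Equation (12.3), valid for H < K and S >= H."""
    k = 2.0 * r / sigma ** 2
    disc_K = K * math.exp(-r * tau)
    x, y = S / K, H * H / (S * K)
    return (S * norm_cdf(d1(x, r, sigma, tau)) - disc_K * norm_cdf(d2(x, r, sigma, tau))
            - S * (H / S) ** (k + 1.0) * norm_cdf(d1(y, r, sigma, tau))
            + disc_K * (H / S) ** (k - 1.0) * norm_cdf(d2(y, r, sigma, tau)))


# Hypothetical inputs with H < K.
S, H, K, r, sigma, tau = 100.0, 90.0, 105.0, 0.05, 0.25, 0.5
c_do = down_and_out_call_reflection(S, H, K, r, sigma, tau)
c_di = bs_call(S, K, r, sigma, tau) - c_do   # in-out symmetry, equation (12.2)
print(c_do, down_and_out_call_explicit(S, H, K, r, sigma, tau), c_di)
```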

In contrast to the down-and-out call, the up-and-out call option is defined to have nonzero 
value for values $S < H$ and also has the same pay-off, namely, that of the plain call struck 
at $K$. The European up-and-out call option is zero for all spot values in the case $H < K$, i.e., 
$C^{UO} = 0$. This follows since the asset price $S$ must be below the barrier, $S < H$, for nonzero 
values of the option. However, the pay-off is that of a call struck at $K$, where $K > H$, which 
always gives a pay-off of zero, hence $C^{UO} = 0$. Then from in-out symmetry we immediately 
have $C^{UI}(S, K, \tau) = C(S, K, \tau)$. 


12.1.2 Exact Formulas for Barrier Calls for the Case H > K 


For a European down-and-out call option value we have [i.e., equation (3.51)]: 

$$\begin{aligned}
C^{DO}(S, K, \tau) = {} & S N(d_1(S/H)) - K e^{-r\tau} N(d_2(S/H)) \\
& - S (H/S)^{k+1} N(d_1(H/S)) \\
& + K e^{-r\tau} (H/S)^{k-1} N(d_2(H/S)),
\end{aligned} \tag{12.6}$$

and from symmetry the corresponding down-and-in call has value 

$$C^{DI}(S, K, \tau) = C(S, K, \tau) - C^{DO}(S, K, \tau). \tag{12.7}$$

The European up-and-in call option now has value [i.e., equation (3.62)]: 

$$\begin{aligned}
C^{UI}(S, K, \tau) = {} & S N(d_1(S/H)) - K e^{-r\tau} N(d_2(S/H)) \\
& - S (H/S)^{k+1} \bigl[N(-d_1(H^2/SK)) - N(-d_1(H/S))\bigr] \\
& + K e^{-r\tau} (H/S)^{k-1} \bigl[N(-d_2(H^2/SK)) - N(-d_2(H/S))\bigr],
\end{aligned} \tag{12.8}$$

and from symmetry the corresponding up-and-out call has value 

$$C^{UO}(S, K, \tau) = C(S, K, \tau) - C^{UI}(S, K, \tau). \tag{12.9}$$


12.1.3 Exact Formulas for Barrier Puts for the Case H < K 


All four cases of put barrier options are defined in the same fashion as their corresponding 
call barrier options, except the payoff function is that of the plain-vanilla put rather than 
the call. 

A European up-and-out put option with $H < K$ has price [i.e., equation (3.57)]: 

$$\begin{aligned}
P^{UO}(S, K, \tau) = {} & -S N(-d_1(S/H)) + K e^{-r\tau} N(-d_2(S/H)) \\
& + S (H/S)^{k+1} N(-d_1(H/S)) \\
& - K e^{-r\tau} (H/S)^{k-1} N(-d_2(H/S)),
\end{aligned} \tag{12.10}$$

and symmetry gives the up-and-in put in terms of the plain-vanilla put price $P(S, K, \tau)$, 

$$P^{UI}(S, K, \tau) = P(S, K, \tau) - P^{UO}(S, K, \tau), \tag{12.11}$$

where 

$$P(S, K, \tau) = -S N(-d_1(S/K)) + K e^{-r\tau} N(-d_2(S/K)). \tag{12.12}$$

The down-and-in put price takes the form [i.e., equation (3.56)]: 

$$\begin{aligned}
P^{DI}(S, K, \tau) = {} & -S N(-d_1(S/H)) + K e^{-r\tau} N(-d_2(S/H)) \\
& + S (H/S)^{k+1} \bigl[N(d_1(H^2/SK)) - N(d_1(H/S))\bigr] \\
& - K e^{-r\tau} (H/S)^{k-1} \bigl[N(d_2(H^2/SK)) - N(d_2(H/S))\bigr],
\end{aligned} \tag{12.13}$$

and symmetry gives the down-and-out put, 

$$P^{DO}(S, K, \tau) = P(S, K, \tau) - P^{DI}(S, K, \tau). \tag{12.14}$$

12.1.4 Exact Formulas for Barrier Puts for the Case H > K 


The European down-and-out put $P^{DO}(S, K, \tau) = 0$. This result is obtained since for any $S$ 
value below the barrier $H$ we have $P^{DO} = 0$. The pay-off is also zero unless $S < K$, in which 
case $S < H$, giving $P^{DO} = 0$. Hence $P^{DO} = 0$ for all spot values in the allowed range $S > H$. 
The down-and-in put follows from usual symmetry, 

$$P^{DI}(S, K, \tau) = P(S, K, \tau). \tag{12.15}$$

The up-and-in put price for $H > K$ is given by [from equation (3.58)] 

$$P^{UI}(S, K, \tau) = -S (H/S)^{k+1} N(-d_1(H^2/SK)) + K e^{-r\tau} (H/S)^{k-1} N(-d_2(H^2/SK)), \tag{12.16}$$

with symmetry giving 

$$P^{UO}(S, K, \tau) = P(S, K, \tau) - P^{UI}(S, K, \tau). \tag{12.17}$$




12.2 Replication of Up-and-Out Barrier Options 


Let us consider in detail the problem of replicating a European up-and-out call option $C^{UO}$ 
struck at $K$ with barrier $H$ and maturing at time $T$. The underlying theory of replication tells 
us that if two portfolios have equal value along the boundaries in $(S, t)$ space, then their worth 
is also the same at all points that are interior to the boundaries. For the call barrier in question, 
the payoff function gives us a choice of boundary, with $S = H$ defining a line of constant $S$ 
joining another line at the point $(H, T)$ given by $t = T$. The $t = T$ boundary gives 

$$C^{UO}(S, K, \tau = 0) = \max(S - K, 0), \qquad S < H, \tag{12.18}$$

while the $S = H$ boundary gives 

$$C^{UO}(S = H, K, \tau) = 0, \qquad \text{for all } \tau. \tag{12.19}$$

The other obvious boundary is at $S = 0$, where the option value is zero and any portfolio consisting of a 
linear combination of vanilla calls will automatically match this value. 

We now consider replicating the up-and-out call with a portfolio consisting of a linear 
combination of plain calls: 

$$\Pi(S, t) = C(S, K, T - t) + \sum_{i=0}^{N-1} a_i\, \Theta(T_i - t)\, C(S, K_i, T_i - t). \tag{12.20}$$

Note that we are using a notation that makes explicit the maturities $T_i$ of the various options. 
The first term is just a call struck at $K$ with the same maturity $T$ as the barrier option. The 
second term involves a linear combination of positions $a_i$ in the plain calls maturing at $T_i$, where 
$t < T_i \le T$. Note also the use of the Heaviside step function defined by $\Theta(x) = 1$ for $x > 0$ 
and $\Theta(x) = 0$ for $x \le 0$. This function is used explicitly to emphasize that any option is set to 
zero value for any negative value of time to maturity $T_i - t$; i.e., the particular option becomes 
excluded from the hedge portfolio. We also make the choice $T_0 = T$, so one of the calls with 
position $a_0$ in the sum also has the same maturity as the barrier option. This call, as well 
as all other calls in the sum, is meant to have strike $K_i \approx H$. That is, the strikes $K_i$ are not 
necessarily set exactly equal to the barrier level $H$. This is meant to simulate a more realistic 
situation in which a trader does not have all strikes with given maturity always available. 
This situation arises in the bhedge spreadsheet application and is captured by use of the input 
field corresponding to the "percentage away from barrier." Let us denote this quantity by 
$p^{\%}$, where $0\% \le p^{\%} \le 100\%$. Then for every $i$th available maturity date $T_i$, the hedge should 
proceed to include the strike $K_i$ closest to $H$ for which $|K_i - H|/H \le p^{\%}$, if any such strikes 
are available. If none are available, then such a term is eliminated from the sum, giving fewer 
contract terms available for the replication and, hence, for the hedge. 

The problem is then reduced to finding the $N$ constants $a_i$ that give us the hedge positions 
to be shorted (i.e., the $a_i$ are actually the negative positions for the hedge portfolio). For a 
replication at today's time we interpret $t$ as current calendar time and $S$ as the spot, with 
$\Pi(S, t)$ in equation (12.20) being the value of our approximate replicating portfolio consisting 
of $a_i$ positions in each call maturing at $T_i$. This problem is formulated in a precise manner 
as follows. We consider a sequence of $M$ increasing times: $t_0 < t_1 < \cdots < t_{M-1}$, with $t_0 \ge t$ 
and $t_{M-1} < T$. For convenience we pick these $t_\alpha$ to be equally spaced. Moreover, we have 
the extra freedom of generally using more time slices than maturities $T_i$; i.e., $M \ge N$. Note 
that the times $t_\alpha$ will not necessarily coincide with the maturities $T_i$. See Figure 12.1 for a 
schematic of the replication strategy. 

FIGURE 12.1 The replication strategy for an up-and-out European option with upper barrier at stock 
level S = H. 


We then match $\Pi(H, t) = C^{UO}(H, K, T - t) = 0$ at each $t = t_\alpha$ according to 
equation (12.19), which upon using equation (12.20) leads to the generally overdetermined linear 
system of $M$ equations in the $N$ unknowns $a_i$. In matrix form, 

$$\sum_{i=0}^{N-1} A_{\alpha i}\, a_i = b_\alpha, \qquad \alpha = 0, 1, \ldots, M - 1, \tag{12.21}$$

where the $M \times N$ matrix with elements $A_{\alpha i}$ is given by 

$$A_{\alpha i} = \Theta(T_i - t_\alpha)\, C(H, K_i, T_i - t_\alpha) \tag{12.22}$$

and 

$$b_\alpha = -C(H, K, T - t_\alpha). \tag{12.23}$$

This system is solved numerically by finding the minimum-norm solution via a singular value 
decomposition approach. The linear algebra library MFLapack is useful for this purpose. The 
solution vector is $a = (a_0, a_1, \ldots, a_{N-1})$. The replicating portfolio has value $\eta\Pi(S, t)$, where 
$\eta$ denotes the total position (number of purchased contracts) in the barrier option $C^{UO}$. The 
total position that is shorted in each replicating $i$th call is then $\eta a_i$. The exact (i.e., target) 
portfolio value $\eta C^{UO}$ is plotted against the result $\eta\Pi$, and one observes the mismatch between 
the two smooth curves as a function of $S$ at today's time $t$. The range of $S$ should be chosen 
judiciously within $S_{min} < S < S_{max}$, where $S_{max} = H$ and $S_{min}$ is considerably less than $K$ but 
greater than zero. Note that for the case $H > K$ one uses equation (12.9) for $C^{UO}$, and for 
$H < K$ one simply has $C^{UO} = 0$. 
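
The construction of the linear system (12.21)-(12.23) and its least-squares solution can be sketched in Python with NumPy. This is an illustration under stated assumptions rather than the bhedge implementation: `numpy.linalg.lstsq` (an SVD-based minimum-norm least-squares solver) is used in place of MFLapack, times are measured in years from today ($t = 0$), all strikes are set equal to $H$, and the volatility and the grid of maturities are hypothetical. The exact value uses equations (12.8) and (12.9) for $H > K$.

```python
import math
import numpy as np


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d1(x, r, sigma, tau):
    return (math.log(x) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))


def bs_call(S, K, r, sigma, tau):
    """Plain call; reduces to the payoff when the time to maturity is not positive."""
    if tau <= 0.0:
        return max(S - K, 0.0)
    a = d1(S / K, r, sigma, tau)
    return S * norm_cdf(a) - K * math.exp(-r * tau) * norm_cdf(a - sigma * math.sqrt(tau))


def up_and_out_call(S, H, K, r, sigma, tau):
    """Exact value from equations (12.8) and (12.9); requires H > K and S <= H."""
    k, sq = 2.0 * r / sigma ** 2, sigma * math.sqrt(tau)
    x1, y, y1 = d1(S / H, r, sigma, tau), d1(H * H / (S * K), r, sigma, tau), d1(H / S, r, sigma, tau)
    disc_K = K * math.exp(-r * tau)
    up_in = (S * norm_cdf(x1) - disc_K * norm_cdf(x1 - sq)
             - S * (H / S) ** (k + 1.0) * (norm_cdf(-y) - norm_cdf(-y1))
             + disc_K * (H / S) ** (k - 1.0) * (norm_cdf(-(y - sq)) - norm_cdf(-(y1 - sq))))
    return bs_call(S, K, r, sigma, tau) - up_in


def hedge_positions(H, K, T, r, sigma, strikes, maturities, times):
    """Build A (12.22) and b (12.23) and solve (12.21) in the least-squares sense."""
    A = np.array([[bs_call(H, Ki, r, sigma, Ti - ta) if Ti > ta else 0.0
                   for Ki, Ti in zip(strikes, maturities)] for ta in times])
    b = np.array([-bs_call(H, K, r, sigma, T - ta) for ta in times])
    a = np.linalg.lstsq(A, b, rcond=None)[0]
    return a, A, b


def portfolio_value(S, t, K, T, r, sigma, a, strikes, maturities):
    """Replicating portfolio (12.20)."""
    total = bs_call(S, K, r, sigma, T - t)
    for ai, Ki, Ti in zip(a, strikes, maturities):
        if Ti > t:  # Heaviside factor
            total += ai * bs_call(S, Ki, r, sigma, Ti - t)
    return total


# Hypothetical setup: N = 6 maturities, M = 20 time slices, sigma assumed.
H, K, T, r, sigma = 780.0, 550.0, 0.62, 0.07, 0.25
maturities = [T * (i + 1) / 6 for i in range(6)]
strikes = [H] * 6
times = [T * alpha / 20 for alpha in range(20)]
a, A, b = hedge_positions(H, K, T, r, sigma, strikes, maturities, times)
for S in (500.0, 600.0, 700.0):
    print(S, portfolio_value(S, 0.0, K, T, r, sigma, a, strikes, maturities),
          up_and_out_call(S, H, K, r, sigma, T))
```

The loop prints the replicating value next to the exact value at a few spot levels; how close the two are depends on the assumed volatility and on the number and placement of the available maturities.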

Typical results for the bhedge spreadsheet should indicate two smooth curves for 
the exact and approximate values of the portfolios as a function of spot $S$. Agreement 
should be overall quite good, within less than 5% for most points, and with maximum 
observed deviations of only about 10%. One can also experiment with increasing the 
number of equations (i.e., the number $M$ of time slices). As one increases this number past 
$N$, the results should not change in any noticeable way. Hence, the solution is provided 
for any $M \ge N$ choice. Figure 12.2 shows a replication of an up-and-out European call 
while allowing the possible strikes $K_i$ to deviate within 10% at most from the barrier; 
i.e., $p^{\%} = 10\%$. The number of contracts is $\eta = 100{,}000$. The replicating portfolio consists of 
twelve plain-vanilla calls with positions: 100,000; 15,026.21; 11,953; 26,377.41; 42,179.29; 
27,038.67; 105,598.27; 160,870.81; 67,371.38; 412,469.94; 1,495,103.62; -2,595,565.28, 
strikes: 550; 780; 775; 780; 780; 775; 780; 780; 780; 780; 780; 780, and 
maturities: 15-Apr-2000; 1-Oct-1999; 15-Oct-1999; 1-Nov-1999; 1-Dec-1999; 15-Dec-1999; 
15-Jan-2000; 15-Feb-2000; 1-Mar-2000; 15-Mar-2000; 1-Apr-2000; 15-Apr-2000, 
respectively. The number of time slices was chosen to be $M = 20$. In practice, the computed 
positions can be rounded to the nearest integer multiple of 100. For example, 15,026.21 may 
be rounded to 15,000 (i.e., 150 lots of contract size 100). 

FIGURE 12.2 Comparison of an actual replication against the exact value for an up-and-out call option 
with upper barrier at H = 780, strike K = 550, interest rate r = 7%, current date t = September 1, 1999, 
and maturity date T = April 15, 2000. 

This completes the discussion and static hedge implementation for up-and-out barrier 
calls. For the case of up-and-out barrier puts, the implementation is very similar, except for 
an important difference. The replication is again accomplished via a linear combination in 
vanilla calls, except the first term is a position in a put (rather than a call) struck at $K$. From 
put-call parity one also sees that the put can be replaced by a call plus a position in cash 
(or bond) and a stock position. For the up-and-out put option with barrier at $H$, the replication 
formula analogous to equation (12.20) is now 

$$\Pi(S, t) = P(S, K, T - t) + \sum_{i=0}^{N-1} a_i\, \Theta(T_i - t)\, C(S, K_i, T_i - t). \tag{12.24}$$

One then solves a matrix equation of the same form as equation (12.21), where the $M \times N$ 
matrix with elements $A_{\alpha i}$ is again given by equation (12.22), but now 

$$b_\alpha = -P(H, K, T - t_\alpha). \tag{12.25}$$

Note that the boundary condition along the line $S = 0$, all $t < T$, is automatically satisfied for 
any choice of the $a_i$ since the calls all have zero value at $S = 0$. This gives 

$$\Pi(S = 0, t) = P(S = 0, K, T - t) = K e^{-r(T - t)}, \tag{12.26}$$

which must be the case for the up-and-out put option value when $S = 0$. 


12.3 Replication of Down-and-Out Barrier Options 


The preceding section treats up-and-out type of knockout options. To replicate down-and-out 
options, the treatment is similar, except the expansion term in the $a_i$ is now done in puts 
rather than calls. This is dictated by the difference in the boundary conditions. That is, the 
zero-boundary conditions at $S = H$ and at the $t = T$ lines are the same, but the boundary at 
the $S = 0$ line is now replaced by the boundary condition at $S \to \infty$. Any linear combination 
of puts will give a zero-boundary condition. This latter boundary condition is convenient 
when expanding in puts as used here. Figure 12.3 gives a schematic of the barrier replication 
for down-and-out European options. 

In particular, for a down-and-out call option we can consider replication using 

$$\Pi(S, t) = C(S, K, T - t) + \sum_{i=0}^{N-1} a_i\, \Theta(T_i - t)\, P(S, K_i, T_i - t). \tag{12.27}$$

One then solves a matrix equation of the same form as equation (12.21), where the $M \times N$ 
matrix with elements $A_{\alpha i}$ is now given by 

$$A_{\alpha i} = \Theta(T_i - t_\alpha)\, P(H, K_i, T_i - t_\alpha) \tag{12.28}$$

and 

$$b_\alpha = -C(H, K, T - t_\alpha). \tag{12.29}$$

Note that the boundary condition $\Pi(S, t) \to C(S, K, T - t)$ as $S \to \infty$ is then automatically 
satisfied, as required. 
The case of the down-and-out put option is then handled via the portfolio in puts: 

$$\Pi(S, t) = P(S, K, T - t) + \sum_{i=0}^{N-1} a_i\, \Theta(T_i - t)\, P(S, K_i, T_i - t). \tag{12.30}$$

One then solves a matrix equation of the same form, where the $M \times N$ matrix with elements 
$A_{\alpha i}$ is exactly as in equation (12.28), and the coefficients $b_\alpha$ are now 

$$b_\alpha = -P(H, K, T - t_\alpha). \tag{12.31}$$

The correct boundary condition $\Pi(S, t) \to P(S, K, T - t) \to 0$ as $S \to \infty$ is also automatically 
satisfied. 

FIGURE 12.3 The replication strategy for a down-and-out European option with lower barrier at stock 
level S = H. 

FIGURE 12.4 Comparison of an actual replication against the exact value for a down-and-out call 
option with lower barrier at H = 450, strike K = 500, interest rate r = 7%, current date t = September 1, 
1999, and maturity date T = March 1, 2000. 

Figure 12.4 shows a replication of a down-and-out European call when only allowing 
for strikes $K_i$ to match exactly the barrier level; i.e., $p^{\%} = 0\%$. The nominal amount of such 
barrier contracts is $\eta = 100{,}000$. The replicating portfolio consists of five plain-vanilla options 
(the call struck at $K$ followed by four puts) with positions: 100,000; -39,151.60; -36,764.17; 
-2,047.41; -129.29, strikes: 500; 450; 450; 450; 450, and maturities: 1-Mar-2000; 15-Oct-1999; 
1-Jan-2000; 1-Feb-2000; 1-Mar-2000, respectively. The number of time slices was again chosen 
to be $M = 20$. Note that the replication is relatively more accurate for the down-and-out versus 
the up-and-out call. 

The preceding four replication strategies take care of all possible single-barrier European 
calls or puts, since the corresponding knock-in option values follow in a trivial manner from 
the aforementioned knock-in/knockout symmetry relationship. 


CHAPTER 13 


Project: Variance Swaps 


Variance swaps are hedged by a combination of a dynamic and a static hedging strategy. The 
static part involves a replication of a logarithmic payoff function. The objective of this study 
is to construct the logarithmic payoff replication and hence to find hedge ratios for the static 
part of the strategy. This can be achieved by combining positions in calls (or puts), stocks, 
and bonds. 

Worksheets: varswaps 

Required Libraries: MFioxl, MFBlas, MFFuncs, MFLapack, MFCollection 


13.1 The Logarithmic Pay-Off 


A variance swap is a forward contract on an annualized variance or the square of the realized 
volatility. The payoff $\Phi$ at final expiry time $T$ is given by 

$$\Phi = \mathcal{N} \times (\sigma_R^2 - K_{var}), \tag{13.1}$$

where $K_{var}$ is a fixed swap rate (i.e., the variance swap rate), $\mathcal{N}$ is the notional amount of the 
swap in dollars per annualized volatility point squared, and $\sigma_R^2$ is the realized variance (in 
annual terms) of an underlying market observable over the life of the contract. The underlying 
can be a stock price, a futures price, an index, etc. In Chapter 1, we discussed such a contract 
in detail where the underlying was chosen as a futures price and we showed how to (i) derive 
a fair value for the swap rate and (ii) replicate the realized variance in terms of a trading 
strategy involving a dynamic and a static component. In this project the goal is only to 
replicate the static component of the contract. Moreover, we shall assume that the underlying 
asset is a stock with price $S_t$ at time $t$. Two definitions of the historically realized variance 
are possible, depending on whether we use log-returns, in which case it is defined by 

$$\sigma_R^2 = \frac{1}{n} \sum_{i=1}^{n} \left(\log \frac{S_i}{S_{i-1}}\right)^2, \tag{13.2}$$

or arithmetic returns, in which case it is defined by 

$$\sigma_R^2 = \frac{1}{n} \sum_{i=1}^{n} \left(\frac{S_i - S_{i-1}}{S_{i-1}}\right)^2. \tag{13.3}$$


The $S_i$ are quoted stock prices at interval times indexed by $i$. Note that if these are taken as 
daily closing prices, then one must convert the variance into a per annum basis (i.e., in terms 
of annualized variance). An example of a variance swap contract is a notional amount of 
$\mathcal{N}$ = \$100,000/(one volatility point)$^2$, with delivery swap rate of $K_{var} = (15\%)^2$ per annum 
on the S&P 500 daily closing index and maturity of 1 year. 
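
For concreteness, the payoff (13.1) can be computed from a series of closing prices as in the Python sketch below. Several conventions here are our own assumptions and are not taken from the contract description: the realized variance is the mean of squared daily returns (no mean subtraction), annualized with 252 trading days, and one volatility point is taken as 1%, so that variances are converted to volatility points squared by dividing by $10^{-4}$.

```python
import math


def realized_variance(prices, periods_per_year=252, use_log_returns=True):
    """Annualized realized variance from a list of quoted prices S_0, ..., S_n."""
    returns = []
    for prev, cur in zip(prices, prices[1:]):
        if use_log_returns:
            returns.append(math.log(cur / prev))
        else:
            returns.append((cur - prev) / prev)
    mean_square = sum(x * x for x in returns) / len(returns)
    return periods_per_year * mean_square  # conversion to a per annum basis


def variance_swap_payoff(prices, k_var, notional_per_point_sq, **kwargs):
    """Payoff N * (sigma_R^2 - K_var), with variances in volatility points squared."""
    sigma2 = realized_variance(prices, **kwargs)
    return notional_per_point_sq * (sigma2 - k_var) / 1e-4


# Hypothetical price path whose daily log-returns alternate between +1% and -1%.
path = [100.0 * math.exp(0.01 * (i % 2)) for i in range(253)]
payoff = variance_swap_payoff(path, k_var=0.15 ** 2, notional_per_point_sq=100_000.0)
print(realized_variance(path), payoff)
```

With this artificial path the annualized realized variance is $252 \times 10^{-4} = 0.0252$, i.e., 27 volatility points squared above the swap rate of $(15\%)^2$.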

A simple mathematical model can be constructed on the assumption that the stock price 
follows a diffusion process with stochastic time-dependent volatility $\sigma_t$ and constant drift 
(given by the interest rate $r$ within an assumed risk-neutral measure): 

$$\frac{dS_t}{S_t} = r\, dt + \sigma_t\, dW_t. \tag{13.4}$$

The realized variance defining the pay-off is assumed to be given by the stochastic integral 

$$\sigma_R^2 = \frac{1}{T} \int_0^T \sigma_t^2\, dt. \tag{13.5}$$

Let $F_t = e^{r(T - t)} S_t$ be the forward-price process. It follows from equation (13.4) that 

$$\frac{dF_t}{F_t} = \sigma_t\, dW_t. \tag{13.6}$$

Ito's lemma gives 

$$d \log F_t = \frac{dF_t}{F_t} - \frac{\sigma_t^2}{2}\, dt. \tag{13.7}$$

Hence, integrating gives 

$$\sigma_R^2 = \frac{2}{T} \int_0^T \left(\frac{dF_t}{F_t} - d \log F_t\right) = -\frac{2}{T} \log \frac{F_T}{F_0} + \frac{2}{T} \int_0^T \frac{dF_t}{F_t}. \tag{13.8}$$

The first term of the realized variance (i.e., the logarithmic function) can be replicated 
statically, while the second term can be replicated dynamically by means of a self-financing 
strategy. In this project we shall only concern ourselves with replication of the static component. 
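
The decomposition (13.8) can be checked numerically on a simulated path. The Python sketch below is only an illustration: the deterministic volatility function, the number of time steps, and the seed are arbitrary assumptions, and the integrals are replaced by sums over a discrete time grid, so the identity holds only up to a sampling error that shrinks as the grid is refined.

```python
import math
import random


def variance_decomposition(T=1.0, n=20000, F0=100.0, seed=0):
    """Simulate dF/F = sigma_t dW and return (sigma_R^2, static term, dynamic term)."""
    rng = random.Random(seed)
    dt = T / n
    F = F0
    integrated_var = 0.0   # discrete version of the integral in (13.5)
    dynamic_sum = 0.0      # discrete version of the integral of dF_t / F_t
    for i in range(n):
        sigma = 0.2 + 0.1 * math.sin(2.0 * math.pi * i * dt)  # assumed volatility path
        z = rng.gauss(0.0, 1.0)
        F_next = F * math.exp(-0.5 * sigma ** 2 * dt + sigma * math.sqrt(dt) * z)
        dynamic_sum += (F_next - F) / F
        integrated_var += sigma ** 2 * dt
        F = F_next
    realized = integrated_var / T
    static_term = -(2.0 / T) * math.log(F / F0)   # logarithmic pay-off
    dynamic_term = (2.0 / T) * dynamic_sum        # self-financing trading gains
    return realized, static_term, dynamic_term


realized, static_term, dynamic_term = variance_decomposition()
print(realized, static_term + dynamic_term)
```

Neither term is close to the realized variance on its own; it is their sum that reproduces it, which is why both the static and the dynamic components are needed in the hedge.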


Static Hedging: Replication of a Logarithmic Pay-Off 


Logarithmic contracts are synthetic and, as such, are not traded directly, but they can be 
approximately replicated by means of portfolios in standard call or put options. Consider the 
following logarithmic payoff function: 

$$f(S_T) = -\frac{2}{T} \log\left(\frac{S_T}{S_0}\right). \tag{13.9}$$

In practice, only a limited set of strikes is available for trading. In the varswaps worksheet, 
all available call options are represented in a table. Each column corresponds to a given 
maturity date and contains all the possible strikes assumed available for trading (for that 
given maturity date). The problem is to approximately replicate the pay-off in equation (13.9) 
by weighting positions in calls corresponding to only the available strikes as well as allowing 
for a variable cash (or bond) and stock position. 

Consider the finite expansion 

$$f(S_T) \approx w_{-1} + w_0 S_T + \sum_{i=1}^{N} w_i \max(S_T - K_i, 0). \tag{13.10}$$


The coefficient $w_{-1}$ gives the (dollar) cash position, while the coefficient $w_0$ gives the stock 
position, and the coefficients $w_i$ give the positions in the calls struck at values $K_i$ (i.e., the 
select set of calls with such strikes that are actually available for trading with given maturity). 
The goal is to find the positions $w_i$ providing the best fit in the least squares sense for the 
log-payoff on the left-hand side. More precisely, we determine the $N + 2$ weights $w_i$ by 
matching approximate payoff function (13.10) with the exact logarithmic payoff function at 
$M$ points in the final stock price $S_T$: $S_T^1, S_T^2, \ldots, S_T^M$, where $M \ge N + 2$. This leads 
to the linear system of $M$ equations in the $N + 2$ unknown weights $w_i$: 

$$f(S_T^j) \approx w_{-1} + w_0 S_T^j + \sum_{i=1}^{N} w_i \max(S_T^j - K_i, 0), \qquad j = 1, \ldots, M. \tag{13.11}$$

In matrix form this system is 

$$\sum_{i=-1}^{N} A_{ji}\, w_i = b_j, \qquad j = 1, \ldots, M, \tag{13.12}$$

where 

$$A_{j,-1} = 1 \ \text{(cash)} \quad \text{or} \quad A_{j,-1} = e^{rT} \ \text{(bond)}, \tag{13.13}$$

$$A_{j0} = S_T^j, \tag{13.14}$$

$$A_{ji} = \max(S_T^j - K_i, 0), \qquad i \ge 1, \tag{13.15}$$

and 

$$b_j = f(S_T^j) = -\frac{2}{T} \log\left(\frac{S_T^j}{S_0}\right). \tag{13.16}$$


The points $S_T^j$ are chosen so they are equally spaced (although they can also be unequally 
spaced) and the spot $S_0$ lies near the middle of the price range $[S_T^1, S_T^M]$. The system of 
equations (13.12) is solved numerically by finding the minimum-norm solution via, for 
example, a singular value decomposition of the matrix of elements $A_{ji}$. The linear algebra 
numerical library MFLapack is useful for this purpose. This gives the required solution vector 
of all weights: $w = (w_{-1}, w_0, w_1, \ldots, w_N)$. Note that when these weights are multiplied by 
the nominal position, we refer to them as the hedge ratios. This gives us the approximate 
replicating portfolio, with the pay-off approximating the target pay-off in equation (13.9). 
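
The least-squares fit of equations (13.10)-(13.16) can be sketched in Python with NumPy as follows. This is an illustration under stated assumptions: `numpy.linalg.lstsq` stands in for the MFLapack routine, the cash column is used unless a rate `r` is supplied (in which case the bond column $e^{rT}$ is used), and the strikes, price range, and maturity are hypothetical values, not those of the varswaps worksheet.

```python
import math
import numpy as np


def replicate_log_payoff(S0, T, strikes, S_grid, r=None):
    """Least-squares weights (w_{-1}, w_0, w_1, ..., w_N) for the log pay-off."""
    K = np.asarray(strikes, dtype=float)
    S = np.asarray(S_grid, dtype=float)
    # Column for the cash (13.13, first form) or bond (second form) position.
    first = np.ones_like(S) if r is None else np.full_like(S, math.exp(r * T))
    calls = np.maximum(S[:, None] - K[None, :], 0.0)   # (13.15)
    A = np.column_stack([first, S, calls])             # M x (N + 2) matrix
    b = -(2.0 / T) * np.log(S / S0)                    # (13.16)
    w = np.linalg.lstsq(A, b, rcond=None)[0]
    return w, A, b


# Hypothetical example: nine strikes, 101 equally spaced final stock prices.
S0, T = 100.0, 1.0
strikes = list(range(60, 150, 10))
S_grid = np.linspace(50.0, 150.0, 101)
w, A, b = replicate_log_payoff(S0, T, strikes, S_grid)
rms_error = float(np.sqrt(np.mean((A @ w - b) ** 2)))
print(w, rms_error)
```

The root-mean-square error over the grid gives a single-number summary of the fit; plotting `A @ w` against `b` reproduces the kind of comparison described in the text.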
One can plot the exact target function f(S;) alongside the approximate function given by 
equation (13.10) as functions of final stock price S, in an appropriate range [Smin; Smax] 
with Smin = S} and Smax = Si. Typically, when using on the order of only five different 
strikes, one should observe fairly good agreement across all stock prices (i.e., 1-5% relative 
error). Also, the results should display the approximate payoff function as being greater 
than or equal to the target payoff function for all points in Sp. An example of an actual 
calculation is displayed in Figure 13.1. There the comparison is for a logarithmic pay-off with 
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FIGURE 13.1 A comparison of an actual logarithmic pay-off with a replicating portfolio achieved 
using a cash and stock position as well as positions in only five available calls at strikes K, = 343.04, 
K, = 505.10, K} = 783.52, K, = 1,137.93, K; = 876.90. 
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FIGURE 13.2 Decomposition of the achieved logarithmic payoff function of Figure 13.1 as a sum of 
pay-offs corresponding to positions in cash, stock, and one short and four long positions in calls at 
strikes K, = 343.04, K, = 505.10, K, = 783.52, K, = 1,137.93, K; = 876.90. 
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spot Sọ = 758, where the set of five strikes used are the ones corresponding to the available 
call contracts with time to maturity of T = 21 days. The pay-off has been rescaled by a 
notional amount of $1 million per volatility point squared. Rapid convergence is observed 
with the use of only M = 7 stock price slices, although the plot shown is with M = 50. 
Figure 13.2 shows the decomposition of the replicating portfolio for the achieved payoff 
curve of Figure 13.1. The positions are: w_, = 699.84, wọ = 78,673.12, w, = —161,399.18 
(short position), w, = 28,390.46, w, = 15,552.98, wy = 9,429.72, ws = 3,717.31. Examples 
of other detailed replications are found on the varswaps spreadsheet. 
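
The least-squares replication of equations (13.11)-(13.16) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: NumPy stands in for MFLapack, the payoff function `f` below is an assumed generic logarithmic payoff (the exact form of equation (13.9) should be substituted), and the function name `replicate_log_payoff`, the strike list, and the price grid are hypothetical choices.

```python
import numpy as np

def replicate_log_payoff(f, strikes, s_grid, bond_factor=1.0):
    """Least-squares weights (w_-1, w_0, w_1, ..., w_N) for system (13.12)."""
    strikes = np.asarray(strikes, dtype=float)
    s_grid = np.asarray(s_grid, dtype=float)
    # Columns of A: cash (or bond), stock, then one call pay-off per strike.
    A = np.column_stack(
        [np.full_like(s_grid, bond_factor), s_grid]
        + [np.maximum(s_grid - k, 0.0) for k in strikes]
    )
    b = f(s_grid)
    # lstsq relies on an SVD-based LAPACK driver and returns the
    # minimum-norm least-squares solution.
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return A, b, w

# Illustrative inputs only: the payoff form, strikes, and grid are assumptions.
S0, T = 758.0, 21.0 / 365.0
f = lambda s: (2.0 / T) * ((s - S0) / S0 - np.log(s / S0))
strikes = [343.04, 505.10, 783.52, 876.90, 1137.93]
s_grid = np.linspace(300.0, 1200.0, 50)      # M = 50 equally spaced points

A, b, w = replicate_log_payoff(f, strikes, s_grid)
max_abs_error = np.max(np.abs(A @ w - b))    # worst-case fit error on the grid
```

Plotting `A @ w` against `b` over `s_grid` gives the kind of comparison described for Figure 13.1; multiplying `w` by the nominal position would give the hedge ratios.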
CHAPTER 14


Project: Monte Carlo Value-at-Risk 
for Delta-Gamma Portfolios 


The objective of this project is to develop multivariate Monte Carlo simulation procedures
for computing the probability density function for the delta-gamma change in portfolio value as
well as to estimate portfolio value-at-risk (VaR) for a given confidence level. Two different
probability distributions for the risk-factor returns are considered: multivariate normal and
multivariate Student t-distributions. This project allows the reader to explore some of the
differences between the use of a normal distribution and a heavy-tailed distribution model
when computing VaR. This project also constitutes a template for future analysis of delta-gamma
portfolio VaR under more general heavy-tailed distributions via the implementation
of t-Copula methods. Such methods allow one to compute VaR for distributions where the
returns can possess different degrees of freedom for different risk factors.

Worksheet: var

Required Libraries: MFioxl, MFBlas, MFLapack, MFRangen, MFStat, MFSort


14.1 Multivariate Normal Distribution


Let $\mathcal{P}(V)$ be the cumulative distribution function for the P&L of a portfolio; that is,
$\mathcal{P}(V)$ is the probability that the change in portfolio value, denoted by $\Delta V$, is less than
or equal to a value $V$. Postulating a multivariate distribution $p(r)$ for the returns, we have
[see equation (4.72) of Chapter 4]:

$$ \mathcal{P}(V) = P(\Delta V \le V) = \int p(r)\,\Theta(V - \Delta V(r))\,dr. \tag{14.1} $$

Here $r$ denotes the vector of returns $r^T = (r_1, \ldots, r_n)$ and the integration is over the complete
$n$-dimensional space of all risk-factor returns. [Note: Here we use superscript $T$ for the
transpose of a matrix (or vector); e.g., $r$ is $n \times 1$ and $r^T$ is $1 \times n$.] The function $\Theta(x)$ is
the Heaviside step function. Let us consider the case of a multivariate (Gaussian) normal
probability distribution function in the space of the risk-factor returns

$$ p(r) = \frac{1}{\sqrt{(2\pi)^n |C|}} \exp\left(-\tfrac{1}{2}\, r^T C^{-1} r\right), \tag{14.2} $$


where C is the n x n covariance matrix and |C| is its determinant. The covariance matrix is 
assumed already given in this project (i.e., generated at random), although it can be readily 
computed from the returns corresponding to a risk-factor time series. 

The value of the portfolio is assumed to be a function of $n$ risk factors denoted by
$x_1, \ldots, x_n$. We shall denote a change in risk factors by the vector $dx$, hence giving the return
for the $i$th factor as $r_i = dx_i/x_i$. Given a change in risk factors, the change in value of the
portfolio within the delta-gamma approximation [neglecting the $\Theta$ term in equation (4.27),
which is trivial to include] is assumed to be given by the second-order Taylor expansion:

$$ \Delta V(r) = r^T \Delta + \tfrac{1}{2}\, r^T \Gamma r. \tag{14.3} $$

The $n$-dimensional delta vector $\Delta$ has components

$$ \Delta_i = x_i \frac{\partial V}{\partial x_i}, \tag{14.4} $$

and the $n \times n$ gamma matrix has elements

$$ \Gamma_{ij} = x_i x_j \frac{\partial^2 V}{\partial x_i\, \partial x_j}. \tag{14.5} $$


The first step is to generate a random delta vector, gamma matrix, and covariance matrix.
This is the functionality of the randomize button on the var spreadsheet. In essence, one is
fabricating sensitivities for a fictitious portfolio.¹ The gamma matrix must have the property
that it is symmetric, $\Gamma^T = \Gamma$. The covariance matrix must be symmetric positive-definite.
Based on these greeks and the covariance matrix, one then computes VaR and P&L using a
plain Monte Carlo technique as follows.
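
As a rough illustration of this randomization step and of equation (14.3), the following Python sketch fabricates sensitivities and evaluates the delta-gamma P&L. It uses NumPy rather than the MF libraries, and the construction of the random matrices, the seed, and the helper name `delta_gamma_pnl` are assumptions made for the example, not the spreadsheet's actual procedure.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, for reproducibility only
n = 10                           # number of risk factors (illustrative)

# Fabricated sensitivities for a fictitious portfolio.
delta = rng.normal(size=n)
G = rng.normal(size=(n, n))
gamma = 0.5 * (G + G.T)          # symmetrize so that gamma^T = gamma

# Random covariance: A^T A is symmetric positive-definite when A has full
# rank; the small diagonal shift guards against near-singularity.
A = rng.normal(size=(n, n))
cov = A.T @ A / n + 1e-8 * np.eye(n)

def delta_gamma_pnl(r, delta, gamma):
    """Equation (14.3): r^T delta + 0.5 r^T gamma r, for each row of r."""
    r = np.atleast_2d(r)
    return r @ delta + 0.5 * np.einsum("si,ij,sj->s", r, gamma, r)
```

Passing an array with one scenario per row returns one P&L value per scenario.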

To implement a plain Monte Carlo algorithm without invoking any additional variance 
reduction approaches, one begins by performing a Cholesky factorization of the covariance 
matrix 


$$ C = U^T U. \tag{14.6} $$


This factorization is done only once, at the beginning of a simulation. The numerical library
class MFLapack is useful for this purpose. Scenarios can then be generated in a two-step
procedure. First, one samples vectors $y^{(i)}$ with components drawn independently from the
standard normal distribution, $y_k^{(i)} \sim N(0, 1)$, $k = 1, \ldots, n$. The vectors $y^{(i)}$,
$i = 1, \ldots, N_s$, represent intermediate $i$th scenarios. The random-number library
class MFRangen is useful for this purpose. In the second step, the vectors $y^{(i)}$ are transformed
into actual scenario vectors for the correlated returns using

$$ r^{(i)} = U^T y^{(i)}. \tag{14.7} $$

¹ However, the user can also run a VaR simulation by inputting the values $\Delta_i$ and $\Gamma_{ij}$ precomputed for an actual
portfolio with position $(\theta_1, \ldots, \theta_N)$ in $N$ assets or subportfolios (see Sections 4.2.1 and 4.2.2).


These two steps are repeated $N_s$ times, where $N_s$ is the total number of scenarios in the
simulation. The scenarios are then distributed according to $r \sim N(0, C)$. For each scenario vector
$r^{(i)}$ we evaluate the portfolio variation, $\Delta V^{(i)} = \Delta V(r = r^{(i)})$, as given by equation (14.3).
Once the portfolio variations under all scenarios are obtained, the outcomes are sorted in
increasing order, where $\Delta V^{(i+1)} \ge \Delta V^{(i)}$. The MFSort library is useful for this purpose.

Given the sorted portfolio variations $\Delta V^{(i)}$ that were generated from the scenarios, the
value-at-risk (VaR), defined by the probability

$$ P(\Delta V \le -\mathrm{VaR}) = p, \tag{14.8} $$

where $p = 1 - \alpha$ and $\alpha$ is the confidence level (typically $\alpha = 95\%$ to $99\%$), is then estimated
as $\mathrm{VaR} = -\Delta V^{([[pN_s]])}$. Here $[[x]]$ denotes the integer part of a number $x$.


14.2 Multivariate Student t-Distributions


A popular model that introduces fat tails in the returns is the multivariate Student t-distribution
with pdf

$$ p_\nu(r) = \frac{\Gamma((\nu+n)/2)}{\Gamma(\nu/2)\sqrt{(\nu\pi)^n |C|}} \left(1 + \frac{r^T C^{-1} r}{\nu}\right)^{-(\nu+n)/2}, \tag{14.9} $$


where $\Gamma(\cdot)$ is the gamma function. In Chapter 4, this distribution was discussed for the
univariate case $n = 1$. In contrast to the multivariate normal density given by equation (14.2),
this density allows for an additional parameter $\nu$, i.e., the degrees of freedom parameter. Small
values of $\nu \approx 3$ are not uncommon in historical time series and lead to fat-tailed distributions.
The value of $\nu$ is an input to the calculations of VaR and P&L. It is interesting to point
out a few special properties of the multivariate t-distribution. For values $\nu > 2$, t-distributed
random variables with density given by equation (14.9) can be shown to have
covariance matrix $\left(\frac{\nu}{\nu-2}\right) C$. In the special case that $C$ has all unit diagonals, it follows that $C$
corresponds to the correlation matrix of the distribution, with each marginal being a univariate
t with common degrees of freedom $\nu > 2$. More generally each variable has the distribution
of a scaled t random variable with $\nu$ degrees of freedom. Another important property that can
be numerically investigated in this project is that the multivariate t-density converges to the
multivariate normal density, i.e., equation (14.9) becomes equation (14.2) in the limit $\nu \to \infty$.

A useful relationship between random variables of a multivariate t-distribution and
those drawn from a multivariate normal is as follows. Assume the random vector
$R^T = (R_1, \ldots, R_n) \sim t_\nu(0, C)$; i.e., this is shorthand notation for a random vector whose components
are jointly distributed according to the multivariate t-density in equation (14.9). Then $R$ has
the same distribution as the vector given by $X/\sqrt{Y/\nu}$, where $X^T = (X_1, \ldots, X_n) \sim N(0, C)$
and $Y \sim \chi^2_\nu$ (a chi-squared random variable with $\nu$ degrees of freedom) is independent of
$(X_1, \ldots, X_n)$. A chi-squared random number with assumed integer $\nu$ is generated simply
by summing up the squares of $\nu$ independent and identically distributed standard normals
$z_i \sim N(0, 1)$: $Y = \sum_{i=1}^{\nu} z_i^2$. From this property we conclude that a multivariate t random vector $R$ is
generated by a multivariate normal vector with an independent randomly scaled covariance matrix,
i.e., using

$$ R = \frac{U^T Z}{\sqrt{Y/\nu}} = U^T \tilde{R}. \tag{14.10} $$

This result obtains from equation (14.6) with $\tilde{R}$ being a t random vector of uncorrelated (yet
not independent, since they share a common $Y$ random variable) components: $\tilde{R} \sim t_\nu(0, I)$, where $I$
is the $n \times n$ identity matrix. The random vector $Z \sim N(0, I)$ is an $n \times 1$ vector of independent
standard normal components.

Hence, given an integer degree of freedom $\nu > 2$, the simulation procedure for generating
multivariate t random vectors is similar to the procedure for generating multivariate normals,
with only slight modifications. In fact, relation (14.10) points to the specific recipe. The
covariance matrix is Cholesky factorized only once at the beginning of the simulation.
Then for each return scenario $r^{(i)}$, a vector $z^{(i)} \sim N(0, I)$ is generated and independently a
random chi-squared value $Y_i$ is generated. Then using equation (14.10): $r^{(i)} = U^T z^{(i)} / \sqrt{Y_i/\nu}$,
$i = 1, \ldots, N_s$. Each $i$th scenario can therefore be obtained by generating $n + \nu$ independent
standard normal random numbers. The random-number library routine gennor within the
MFRangen class is useful for this purpose.

FIGURE 14.1 Histograms of the P&L, within the delta-gamma approximation, for a generic portfolio
of 10 risk factors using the multivariate (a) normal versus (b) Student t ($\nu = 3$ degrees of freedom)
distributions, respectively. The number of scenarios is $N_s = 2000$. Random gamma and delta sensitivities
were chosen identically for both distributions, while, for percentile $p = 1\%$, the computed values for
VaR were 17.53 versus 28.03 for distributions (a) and (b), respectively.

For both multivariate normal and Student t-distributions, one should observe 1-5%
statistical error when using a number of scenarios on the order of $N_s \sim 10{,}000$. Moreover,
the results of the simulations should demonstrate fatter tails for the P&L corresponding to
the Student t-distribution as well as a correspondingly larger value for VaR at a given percentile.
As well, one should observe a much more pronounced effect as the degrees of freedom $\nu$ is
decreased. Figure 14.1 gives a comparison of the P&L and VaR for an actual Monte Carlo
simulation on a portfolio of 10 risk factors.
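
The plain Monte Carlo recipe of this chapter, for both distributions, might be sketched as follows. NumPy is used here in place of MFLapack, MFRangen, and MFSort; the function name `mc_var`, the seeds, and the randomly fabricated inputs are assumptions for illustration only.

```python
import numpy as np

def mc_var(delta, gamma, cov, n_scen, p=0.01, nu=None, seed=0):
    """Plain Monte Carlo delta-gamma VaR; nu=None means normal returns."""
    rng = np.random.default_rng(seed)
    n = len(delta)
    # NumPy returns the lower factor L with cov = L L^T, so L plays the
    # role of U^T in equation (14.6).
    L = np.linalg.cholesky(cov)
    z = rng.standard_normal((n_scen, n))
    r = z @ L.T                                   # each row is r = U^T z
    if nu is not None:
        # Chi-squared with integer nu: sum of nu squared standard normals.
        y = np.sum(rng.standard_normal((n_scen, nu)) ** 2, axis=1)
        r = r / np.sqrt(y / nu)[:, None]          # equation (14.10)
    dv = r @ delta + 0.5 * np.einsum("si,ij,sj->s", r, gamma, r)
    dv.sort()                                     # increasing order
    k = max(int(p * n_scen), 1)                   # integer part of p * N_s
    return -dv[k - 1], dv

# Fabricated sensitivities and covariance for a fictitious portfolio.
rng = np.random.default_rng(1)
n = 10
delta = rng.normal(size=n)
G = rng.normal(size=(n, n))
gamma = 0.5 * (G + G.T)
A = rng.normal(size=(n, n))
cov = A.T @ A / n + 1e-8 * np.eye(n)

var_normal, pnl_normal = mc_var(delta, gamma, cov, 10_000)
var_t3, pnl_t3 = mc_var(delta, gamma, cov, 10_000, nu=3)
```

Histograms of `pnl_normal` and `pnl_t3` give the kind of comparison shown in Figure 14.1; rerunning with different seeds gives a feel for the statistical error of the VaR estimate.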
CHAPTER 15


Project: Covariance Estimation and 
Scenario Generation in Value-at-Risk 


This project investigates the covariance properties of return time series generated by a
multivariate Monte Carlo simulation.

In particular, one generates a random, symmetric positive-definite matrix with a specific set
of eigenvalues. The matrix is then interpreted as the covariance matrix of a multivariate normal
distribution, and multivariate normal scenarios are subsequently generated. The covariance
matrix is finally reestimated with one of the methods typically used in VaR implementations.
The reestimated covariance is then analyzed in terms of eigenvalue spectral concentration as
compared with the original covariance matrix.

Worksheet: recov

Required Libraries: MFioxl, MFBlas, MFLapack, MFFuncs, MFRangen, MFSort


15.1 Generating Covariance Matrices of a Given Spectrum


In this section we describe a technique for generating a random positive-definite symmetric
$N$-dimensional matrix with a specific preassigned set of eigenvalues. The first step is to
generate a symmetric positive-definite (SPD) matrix. Two alternatives are possible. The
first is to simply use the MATP routine in the random-number (and random-matrix) library
MFRangen. This will generate an SPD type of matrix of a given dimension $N$ as specified
by the user. The other alternative is to generate an upper triangular random matrix (using the
normal random-number generator of MFRangen). This preliminary matrix, $A$, can then be
used to form the matrix $B = A^T A$ (superscript $T$ stands for matrix transpose). Now, $B$ is of type
SPD as long as one makes sure that all of the generated diagonal elements of $A$ are nonzero.
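
The second alternative is easy to sketch in Python. The example below uses NumPy as a stand-in for MFRangen, and the helper name `random_spd` and the handling of zero diagonal entries are assumptions made for illustration.

```python
import numpy as np

def random_spd(n, rng):
    """Return B = A^T A with A a random upper triangular matrix."""
    A = np.triu(rng.standard_normal((n, n)))
    # A zero diagonal entry would make A singular and B only semidefinite;
    # this has probability zero for continuous draws but is cheap to guard.
    d = np.diag(A).copy()
    d[d == 0.0] = 1.0
    np.fill_diagonal(A, d)
    return A.T @ A

rng = np.random.default_rng(0)
B = random_spd(5, rng)
eigenvalues = np.linalg.eigvalsh(B)   # all strictly positive for an SPD matrix
```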

Note that the matrix $B$ can, in principle, represent a covariance matrix. Most probably,
however, this matrix will largely be dominated by only one or a very small number of
principal components. It is of interest in the present study to consider covariance matrices
whose eigenvalues are more equally spaced. In particular, one can enforce a set of preassigned
eigenvalues. This leads to the next step, namely, creating a covariance matrix of given
eigenvalues. To accomplish this, one makes use of the MFLapack library and performs a
singular value decomposition (SVD) on the matrix $B$,

$$ B = O \beta O^T. \tag{15.1} $$

The matrix $O$ is now an orthonormal matrix whose columns $O_i$ are normalized eigenvectors
of $B$, and $\beta$ is the diagonal matrix of eigenvalues $\beta_i$ of $B$. That is, $B O_i = \beta_i O_i$, with
$O_i \cdot O_j = \delta_{ij}$, $i, j = 1, \ldots, N$. The $O_i$ are essentially the randomly generated principal
components of the covariance matrix. After having performed this SVD, the eigenvalues $\beta_i$ are
readjusted (i.e., specifically reassigned) by making the change $\beta_i \to \alpha_i$ for a chosen set of $\alpha_i$,
$i = 1, \ldots, N$. The new diagonal matrix $\alpha$ of eigenvalues $\alpha_i$ is then used to give the desired
covariance matrix:

$$ C = O \alpha O^T. \tag{15.2} $$

One is now at liberty to choose an eigenvalue set. For example, by setting
$\alpha_i = \kappa(\sqrt{N+1} - \sqrt{i})$ for some positive constant $\kappa$, one has a slowly decaying spectral density
(i.e., eigenvalue density) as one moves away from the origin of zero eigenvalue.


15.2 Reestimating the Covariance Matrix and the Spectral Shift


As in the previous VaR project, we assume a multivariate normal distribution given by
equation (14.2) for the returns. Scenarios are then generated for returns $r$ using the same
procedure described in detail in the plain Monte Carlo approach of the VaR project. Namely,
one generates a vector $y$ of independent standard normals and multiplies this vector by
the Cholesky factored form of the foregoing covariance matrix $C$ of equation (15.2). This
gives a scenario $r^{(k)}$. Each $r^{(k)}$ is then used to form the exponentially weighted sum over $N_s$
scenarios:

$$ \hat{C}_{ij} = (1 - \lambda) \sum_{k=1}^{N_s} \lambda^{k-1}\, r_i^{(k)} r_j^{(k)} \tag{15.3} $$

for all $i, j = 1, \ldots, N$ and where $\lambda$ is a damping parameter or decay factor strictly less than
unity, $0 < \lambda < 1$. In particular, the value for $\lambda$ is typically chosen between 0.94 and 0.97.
The choice of $\lambda = 0.97$ roughly corresponds to assuming a 1-year time window of trading
days. This parameter, therefore, determines the relative weights given to past observations
(i.e., the return scenarios) and hence the amount of data that is actually used to estimate
the variance-covariance of the return time series. The factor $(1 - \lambda)$ is a normalization since
$\sum_{k=0}^{n} \lambda^k \approx (1 - \lambda)^{-1}$ for large $n$. Note that equation (15.3) is not applicable for the special
case $\lambda = 1$. Hence, for zero damping ($\lambda = 1$) one must replace equation (15.3) by

$$ \hat{C}_{ij} = \frac{1}{N_s} \sum_{k=1}^{N_s} r_i^{(k)} r_j^{(k)}. \tag{15.4} $$

Note that the time series considered here are scenario sets, which are quite lengthy,
typically of order 10,000.
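
The two estimators can be written compactly in Python. The sketch below is illustrative only: it uses NumPy, the function name `ewma_covariance` is hypothetical, and the convention that row $k-1$ of the input array holds scenario $r^{(k)}$ is an assumption of the example.

```python
import numpy as np

def ewma_covariance(r, lam):
    """Covariance estimate from scenarios r of shape (N_s, N).

    Row k-1 of r is scenario r^(k) and receives weight (1 - lam) * lam**(k-1),
    as in equation (15.3). lam = 1 falls back to the equal-weight average of
    equation (15.4).
    """
    r = np.asarray(r, dtype=float)
    n_s = r.shape[0]
    if lam == 1.0:
        weights = np.full(n_s, 1.0 / n_s)
    else:
        weights = (1.0 - lam) * lam ** np.arange(n_s)
    return (r * weights[:, None]).T @ r

rng = np.random.default_rng(0)
r = rng.standard_normal((10_000, 3))       # scenarios with true covariance I
c_damped = ewma_covariance(r, 0.97)
c_plain = ewma_covariance(r, 1.0)
# Normalization check: the damped weights sum to nearly one for long series.
weight_sum = (1.0 - 0.97) * np.sum(0.97 ** np.arange(10_000))
```

With damping, only the most heavily weighted scenarios contribute appreciably, so `c_damped` is a noisier estimate than `c_plain` for the same scenario set.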

Having estimated the covariance matrix using equation (15.3) or (15.4), one can then
compare the $\hat{C}_{ij}$ elements with the original matrix elements $C_{ij}$. A more interesting comparison,
however, is obtained by computing the eigenvalues for both $C$ and $\hat{C}$ matrices. Earlier
we denoted the eigenvalues of $C$ by $\alpha_1, \alpha_2, \ldots, \alpha_N$. Correspondingly, we will denote the
eigenvalues of $\hat{C}$ by $\alpha'_1, \alpha'_2, \ldots, \alpha'_N$. Note that these eigenvalues can be obtained in a variety
of ways, one of which is the singular value decomposition, as given earlier, of the respective
covariance matrices. The so-called vectors of singular values give the sets of eigenvalues.
The objective is to compare eigenvalues in terms of the density (or distribution) for the $\alpha_i$
versus the distribution in the $\alpha'_i$. The eigenvalue density $f(\alpha)$ at the point $\alpha$ is defined as
the number of eigenvalues lying between $\alpha$ and $\alpha + d\alpha$ for infinitesimal $d\alpha$. The densities
are actually estimated by considering histogram plots of the respective eigenvalue sets. The
density plots should demonstrate a probability increase or shift of distribution toward the
origin in the spectrum of eigenvalues as the decay factor $\lambda$ is decreased from 1.0 to 0.94, the
latter case corresponding to more damping of past observations. Figure 15.1 gives a histogram
comparison of actual versus recovered eigenvalue distributions for a covariance matrix with
100 risk factors, as generated in the recov spreadsheet. A simple extension to this project is
to include an analysis of the differences in the principal components of $C$ and $\hat{C}$.

FIGURE 15.1 Eigenvalue distributions for a 100-dimensional covariance matrix using 10,000 scenarios.
The recovered distribution is computed with damping factor $\lambda = 0.97$.
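
The whole experiment of this chapter can be sketched end to end in Python. This is a minimal illustration, not the recov spreadsheet's implementation: NumPy replaces the MF libraries, a symmetric eigendecomposition (`eigh`) is used in place of the SVD (the two coincide for SPD matrices up to ordering), and the dimension, scenario count, `kappa`, and bin count are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n_s, lam, kappa = 20, 5_000, 0.97, 1.0   # illustrative sizes only

# Random SPD matrix B = A^T A and its orthonormal eigenvectors O.
A = np.triu(rng.standard_normal((N, N)))
_, O = np.linalg.eigh(A.T @ A)

# Preassigned spectrum alpha_i = kappa * (sqrt(N + 1) - sqrt(i)), inserted
# into C = O alpha O^T as in equation (15.2).
i = np.arange(1, N + 1)
alpha = kappa * (np.sqrt(N + 1) - np.sqrt(i))
C = (O * alpha) @ O.T

# Scenarios r = L y with C = L L^T, then the damped estimate of (15.3).
L = np.linalg.cholesky(C)
r = rng.standard_normal((n_s, N)) @ L.T
w = (1.0 - lam) * lam ** np.arange(n_s)
C_hat = (r * w[:, None]).T @ r

alpha_true = np.linalg.eigvalsh(C)          # ascending eigenvalues of C
alpha_hat = np.linalg.eigvalsh(C_hat)       # ascending eigenvalues of C_hat

# Histogram estimates of the two eigenvalue densities on common bins.
bins = np.linspace(0.0, max(alpha_true.max(), alpha_hat.max()), 11)
density_true, _ = np.histogram(alpha_true, bins=bins)
density_hat, _ = np.histogram(alpha_hat, bins=bins)
```

Repeating the run for several values of `lam` between 0.94 and 1.0 and comparing `density_hat` with `density_true` is one way to examine the spectral shift described in the text.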
CHAPTER 16


Project: Interest Rate Trees: 
Calibration and Pricing 


The purpose of this project is essentially twofold: to calibrate interest rate trees against
market discount (zero-coupon) curves and to subsequently use the calibrated lattices to
price interest rate products, such as bonds, bond options, caplets, floorlets, and swaptions.
The theory and implementation allow for four different stochastic interest rate models:
Black-Derman-Toy and Ho-Lee (within a binomial lattice approach) and the Hull-White
model and Black-Karasinski (within a trinomial lattice approach).

Worksheets: ir and ycec

Required Libraries: MFioxl, MFFuncs, MFBlas, MFLapack, MFFit


16.1 Background Theory


In developing interest rate trees we consider a subdivision of calendar time $t \in [0, T]$ into
$M$ subintervals $[T_0 = 0, T_1, T_2, \ldots, T_M]$ with time spacing $\Delta t = T_i - T_{i-1}$. Throughout we
shall assume equal time steps, although the lattice methods we present can be extended to
the more general case of unequal time steps. Discount bond prices at current calendar time
$t$ maturing at calendar time $T$ are denoted by $Z_t(T)$ $(= Z_t(r_t, T))$ (see Chapter 2). Consider,
then, a generic (European-style) security with payoff function $\Lambda(r_T, T)$ depending only on the
value attained at maturity time $T$ for the short rate $r_T$. If one assumes market completeness,
the arbitrage-free price of such a security, at current time $t = 0$, is given by the expectation

$$ P_0(r_0, T) = P_0(T) = E_0^{Q(B)}\left[\exp\left(-\int_0^T r_s\,ds\right) \Lambda(r_T, T)\right]. \tag{16.1} $$


Here, the numeraire is chosen as the rolled-over money market account $B_t = e^{\int_0^t r_s\,ds}$, as
discussed in Chapter 2. In more explicit terms, this expectation (which is conditional on the
short rate's having value $r_0$ at time $t = 0$) has the form of an infinite product of conditional
integrations for every incremental time $\Delta t \to 0$. In particular, if we denote the risk-neutral
conditional probability density that the short rate will attain a value $r_i$ at time $T_i$, given $r_{i-1}$
at time $T_{i-1}$, by $p(r_i, r_{i-1}; \Delta t)$, then (for a generally stochastic $r_t$ diffusion process) the price
can be accurately approximated by an $M$-dimensional integral,

$$ P_0(r_0, T) = \int_0^\infty \cdots \int_0^\infty \prod_{i=1}^{M} p(r_i, r_{i-1}; \Delta t)\, e^{-r_{i-1}\Delta t}\, \Lambda(r_M, T)\, dr_M \cdots dr_1, \tag{16.2} $$


where $T_M = T$ is the terminal time. In the limit $\Delta t \to 0$ (or $M \to \infty$, since $T = M\,\Delta t$) this
gives an exact path integral representation of the price. Lattice methods arise by choosing a
finite number $M$ of time slices and evaluating equation (16.2) by using efficiently recombining
lattice point integral approximations. For zero-coupon bonds we have a pay-off of one dollar
with certainty ($\Lambda(r_T, T) = 1$); hence

$$ Z_0(T) = Z_0(r_0, T) = E_0^{Q(B)}\left[\exp\left(-\int_0^T r_s\,ds\right)\right]. \tag{16.3} $$


Of interest are the Arrow-Debreu prices, denoted by $G(r_0, 0; r, T)$ and given by

$$ G(r_0, 0; r, T) = E_0^{Q(B)}\left[\exp\left(-\int_0^T r_s\,ds\right) \delta(r_T - r)\ \Big|\ r_{t=0} = r_0\right], \tag{16.4} $$

which is the expectation of an infinitely narrow butterfly spread pay-off (i.e., the Dirac delta
function) conditional on the short rate's starting at $r_0$ at time $t = 0$. These correspond to the
worth at time $t = 0$, given (i.e., conditional on) current state $r_0$, of a riskless security that
pays one dollar if state $r_T = r$ is attained at any later time $T > 0$. The zero-coupon bonds are
expressed in terms of the Arrow-Debreu values as follows:

$$ Z_0(T) = \int_0^\infty G(r_0, 0; r, T)\,dr. \tag{16.5} $$


An important consistency requirement is the continuity relation

$$ G(r_0, 0; r_i, T_i) = \int_0^\infty G(r_0, 0; r_{i-1}, T_{i-1})\, G(r_{i-1}, T_{i-1}; r_i, T_i)\, dr_{i-1}. \tag{16.6} $$

This formula is the basis for a discrete version that is used in the sections that follow
to generate a forward induction procedure for propagating the Arrow-Debreu prices. The
function $G(r_{i-1}, T_{i-1}; r_i, T_i)$ is the Arrow-Debreu value conditional on the short rate's having
value $r_{i-1}$ at time $T_{i-1}$ and attaining a value of $r_i$ at a later time $T_i > T_{i-1}$. We conclude this
section by noting that the quantity $Z(r, t, t+\Delta t) = Z_t(r, t+\Delta t)$ defined by the conditional
expectation

$$ Z(r, t, t+\Delta t) = E_t^{Q(B)}\left[\exp\left(-\int_t^{t+\Delta t} r_s\,ds\right)\ \Big|\ r_t = r\right] = \int_0^\infty dr_T\, G(r, t; r_T, T = t+\Delta t) \tag{16.7} $$





gives the price of a discount bond at time $t > 0$ (any time later than current time), with
time to maturity of $\Delta t$, conditional on the short rate's having value $r$ at time $t$. Note that
here we have explicitly denoted the conditional nature of the expectation. This formula, in
conjunction with concatenating equation (16.6) for every time step $T_i - T_{i-1}$, forms the basis
for producing lattice pricing formulas of derivatives, such as caplets, floorlets, and swaptions
dealt with later.


16.2 Binomial Lattice Calibration for Discount Bonds


In developing binomial interest rate trees we subdivide calendar time $t \in [0, T]$ into $M$
subintervals $[T_0 = 0, T_1, T_2, \ldots, T_M]$ with time spacing $\Delta t = T_i - T_{i-1}$. At each time $t = T_i$ there
are $i+1$ nodes corresponding to the attainable values of the short rate $r(j, i) = r(j, T_i)$,
$j = 0, 1, \ldots, i$. Note that throughout we use a notation for the short rate whereby $r_t$ denotes
the continuous short rate variable at calendar time $t$, whereas $r(j, i) = r(j, T_i)$ corresponds
to the discretized short rate value at the node with state $j$ and time $T_i$. Also, note that the
indexing of the nodes in binomial models is such that the index has nonnegative value: $j \ge 0$.
Figure 16.1 gives a schematic of the binomial interest rate tree. The two binomial models
considered in this project are the Ho-Lee (HL) and Black-Derman-Toy (BDT) models. The
HL model is the simplest, with no mean reversion. The HL model follows a normal stochastic
process

$$ dr_t = \lambda(t)\,dt + \sigma(t)\,dW_t, \tag{16.8} $$

where $\lambda(t)$ and $\sigma(t)$ are deterministic drift and volatility functions, respectively. One obvious
shortcoming of this model is the admittance of negative interest rates. The BDT model
removes this deficiency by considering the logarithm of the short rate, which is assumed
to follow a stochastic process of the form

$$ d\log r_t = \left[\theta(t) + \frac{d\log\sigma(t)}{dt}\, \log r_t\right] dt + \sigma(t)\,dW_t, \tag{16.9} $$

where $\sigma(t)$ is the lognormal volatility and the drift function allows for a drift component
as well as a mean-reversion component for the variable $\log r_t$. Note that the speed of the
mean reversion is zero for the case of constant volatility. Throughout this study we
shall assume a constant volatility; hence, mean reversion shall remain zero in the current
implementation of the BDT model. In contrast, mean reversion is introduced in later sections
where we implement the Hull-White and Black-Karasinski models using trinomial lattices.
The HL lattice model can be defined by a set of nodes placed according to

$$ r(j, i) = r(j-1, i) + 2\sigma\sqrt{\Delta t}, \tag{16.10} $$

whereas the BDT lattice can be taken as

$$ r(j, i) = r(j-1, i)\exp\left(2\sigma\sqrt{\Delta t}\right), \tag{16.11} $$

for given time slice $T_i = i\,\Delta t$, $i = 0, 1, \ldots, M$. Here $\sigma$ is a (lattice) volatility parameter
for the short rate. Throughout the calibration to market zero-coupon bonds, as discussed in
this section, the volatility shall be preset to a fixed value, independent of time. It should be
noted that there exists a whole range of fixed $\sigma$ values that can produce identical matches
to the same set of market zero-coupon bond prices. Different values for this parameter have
the effect of shifting the overall drift of the tree so as to still keep it risk neutral. Note that
the assumption of a fixed volatility eliminates the reversion component in the BDT model.
By allowing the volatility $\sigma$ to be time dependent, one can further calibrate to a larger set of
market instruments besides zero-coupon bonds.

FIGURE 16.1 A binomial lattice originating at the current short-rate node $r(0, 0)$.

The first step to consider in the calibration of
discount bond prices is the interpolation of yields from given treasury yield data. Consider the
set of maturities $T = T_i$, $i = 1, 2, \ldots, M$, and set the current time $t = T_0 = 0$. The discount
curve for the calibration consists of the set of prices $\{Z_0(T_1), Z_0(T_2), \ldots, Z_0(T_M)\}$ derived
from the set of corresponding yields $\{y_0(T_1), y_0(T_2), \ldots, y_0(T_M)\}$. This set of yields
does not, in practice, match the actual input market set of $N$ maturity yields given at a fixed
set of times, denoted by the set $\{y_0(\bar{T}_1), y_0(\bar{T}_2), \ldots, y_0(\bar{T}_N)\}$. The latter are the actual treasury
yields at times $\bar{T}_1 = 3$ months, $\bar{T}_2 = 6$ months, etc. The foregoing discount prices are therefore
obtained after having interpolated for the yields $y_0(T_i)$ at each $i$th time step. This can be
done either by employing a simple linear interpolation or by using a spline-fitting algorithm
of higher order, such as a cubic spline. The MFFit numerical library class is useful for this
purpose.
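
The interpolation step might look as follows in Python, using simple linear interpolation. Everything numerical here is hypothetical: the market maturities and yields are made-up inputs, the yields are assumed to be continuously compounded so that $Z_0(T_i) = e^{-y_0(T_i) T_i}$, and NumPy's `interp` stands in for the MFFit library.

```python
import numpy as np

# Hypothetical market yields (continuously compounded) at quoted maturities.
market_T = np.array([0.25, 0.5, 1.0, 2.0, 5.0])            # years
market_y = np.array([0.030, 0.032, 0.035, 0.039, 0.045])

M, dt = 20, 0.25
lattice_T = dt * np.arange(1, M + 1)                        # T_i = i * dt

# Piecewise-linear interpolation of the yields onto the lattice times.
lattice_y = np.interp(lattice_T, market_T, market_y)

# Discount factors under the continuous-compounding assumption.
Z0 = np.exp(-lattice_y * lattice_T)
```

A cubic spline could be substituted for `np.interp` without changing the rest of the calibration, since only the vector `Z0` is passed on.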

Lattice methods correspond to fixing the number of integrations in all the equations of
the previous section to some fixed integer, such as the number of time steps in the case of
pricing, and, in turn, evaluating each integral using only two (for the case of a binomial lattice)
or three (for trinomial lattices) points of integration. An important approximation underlying
the binomial lattice methodology is to set the conditional transition density for every time
step $\Delta t$ simply as a constant, $p(r_i, r_{i-1}; \Delta t) = \tfrac{1}{2}$. Moreover, the short rate is taken as locally
constant $r_{i-1}$ within time intervals $[T_{i-1}, T_i]$, hence giving the conditional Arrow-Debreu
values for $\Delta t$ maturity as the simple form
$G(r_k, T_{i-1}; r_j, T_i) = G(k, T_{i-1}; j, T_i) = \tfrac{1}{2} e^{-r(k, i-1)\Delta t}$.
By adopting the binomial short-rate lattices defined by equation (16.10) (for the HL model)
or (16.11) (for the BDT model), we now are in a position to obtain the discrete-time versions
of the equations in the previous section.

To begin with, equation (16.6) takes the discrete form

$$ G(0, 0; j, T_i) = \sum_{\substack{k = j-1,\, j \\ 0 \le k \le i-1}} G(0, 0; k, T_{i-1})\, G(k, T_{i-1}; j, T_i), \tag{16.12} $$

where

$$ G(k, T_{i-1}; j, T_i) = \tfrac{1}{2} \exp[-r(k, i-1)\,\Delta t]. \tag{16.13} $$


Note that equation (16.12) describes a procedure that takes into account the Arrow-Debreu
prices at intermediate nodes $r(k, i-1)$ for the previous time $T_{i-1}$, which are subsequently
used for time stepping by an amount $\Delta t$ until a terminal node $r(j, i)$ is reached at the time
slice $T_i = T_{i-1} + \Delta t$. The sum involves only two possible values for $k$: $k = j-1$ and $k = j$,
with the restriction that $0 \le k \le i-1$. For the extreme (highest or lowest) node there is
only one term in the sum. This is the forward induction equation that is used in practice to
generate all Arrow-Debreu prices $G(0, 0; j, T_i)$ for each $j$th node at terminal time $T_i$. It is
important to note that this forward induction equation can generally be used for any type of
short-rate model (see Figure 16.2). The discrete-time version of the security price given by
equation (16.2) takes the form

$$ P_0(r_0, T_i) = \sum_{j=0}^{i} G(0, 0; j, T_i)\, \Lambda(r(j, T_i), T_i), \tag{16.14} $$

where $G(0, 0; j, T_i)$ are computed using the forward induction relation in equation (16.12).
Specializing this formula to the case of zero-coupon bonds, which have a riskless pay-off of
one dollar, we have the discrete-time lattice version of equation (16.5):

$$ Z_0(T_i) = \sum_{j=0}^{i} G(0, 0; j, T_i). \tag{16.15} $$

FIGURE 16.2 A pictorial representation of the forward propagation of Arrow-Debreu prices on a
binomial lattice originating at the current short-rate node $r(0, 0)$. Their values, i.e., $G(0, 0; j, T_i)$, at
nodes corresponding to a later time step $T_i$ are obtained as a sum of contributions from (at most) two
intermediate time $T_{i-1}$ two-legged paths.


Hence, the Arrow—Debreu prices at all the nodes of a given maturity T are sufficient for 
determining the price of a discount bond of that maturity (see Figure 16.3). In the calibration 
procedure the market zero-coupon prices at times $T = T_i$ are used as input to the left-hand
side of equation (16.15). By solving for the nodes at the $(i-1)$th time step, for every time
slice $T_i$, we imply the whole lattice and hence obtain the market prices of all discount
bonds correctly. In practice, the right-hand side, for each $T = T_i$, is computed by using the
vector of Arrow-Debreu values $G(0,0;k,T_{i-1})$, $k = 0, 1, \ldots, i-1$, which are assumed to
be known from the previous time step, as well as a trial vector of nodes $r(k,i-1)$. These are
plugged into forward induction equation (16.12) while using equation (16.13) and summing
up all node contributions via equation (16.15). At the same time, one also makes use of the
constraint among the $r(k,i-1)$, namely, equation (16.10) or (16.11), depending on whether
one is calibrating the Ho-Lee or BDT model, respectively. Hence, this reduces the discount
bond calibration problem to a succession of $M$ root-finding problems that make the left-
and right-hand sides of equation (16.15) equal for each $T_i$. Note also the single-variable
nature of the problem, since the expressions are reduced to finding just one node, i.e., the
lowest one, $r(0,i-1)$, and the rest follow for time slice $T_{i-1}$. Observe that at each maturity
the nodes being computed are lagged by one time step. One can use the MFZero library
class for the purpose of finding roots.

FIGURE 16.3 Lattice calibration of zero-coupon bonds of maturity $T_i$ uses a sum of Arrow-Debreu
prices beginning from the current time $T_0$ and node $r(0,0)$ up to all nodes $r(j,i)$, $j = 0, \ldots, i$, at time
$T_i$. Note, however, that calibration up to maturity time $T_i$ determines the short-rate lattice points up to
the previous time step with time $T_{i-1}$.

To start the procedure off, one uses $G(0,0;0,0) = 1$ and solves for $r(0,0)$. Note that since
there are only two branches in this case, giving
$G(0,0;k,\Delta t) = \frac{1}{2}\exp[-r(0,0)\Delta t]$ for $k = 0, 1$, one has

$$r(0,0) = -\frac{\log Z_0(T_1 = \Delta t)}{\Delta t}, \tag{16.16}$$
where the first node, $r(0,0)$, is given by the smallest-maturity zero-coupon bond price (i.e., the
initial term structure). If one assumes continuous compounding, then one can also avoid the
numerical root-finding procedure in the case of the Ho-Lee model, which admits a simple
analytical solution for the node positions $r(j,i-1)$ in terms of $Z_0(T_i)$ and the Arrow-Debreu
prices for terminal time $T_{i-1}$.
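The succession of single-variable root-finding problems described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node constraint `ho_lee_nodes` (nodes equally spaced by $2\sigma\sqrt{\Delta t}$) is our assumed reading of equation (16.10), which is not reproduced here; a plain bisection stands in for the MFZero root finder; and all function names are hypothetical.

```python
import math

def bisect(f, lo, hi, tol=1e-14, max_iter=200):
    """Plain bisection; f(lo) and f(hi) must have opposite signs."""
    f_lo = f(lo)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0.0 or hi - lo < tol:
            return mid
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ho_lee_nodes(base, i, sigma, dt):
    """ASSUMED constraint: r(k, i) = r(0, i) + 2 k sigma sqrt(dt), k = 0..i."""
    return [base + 2.0 * k * sigma * math.sqrt(dt) for k in range(i + 1)]

def calibrate_binomial(zero_prices, dt, sigma):
    """zero_prices[i] = Z_0(T_i) for i = 0..M, with zero_prices[0] = 1.

    Returns (rates, ad) with rates[i][k] = r(k, i) and
    ad[i][k] = G(0,0; k, T_i).
    """
    ad = [[1.0]]          # G(0,0; 0,0) = 1
    rates = []
    for i in range(1, len(zero_prices)):
        g_prev = ad[-1]

        def mismatch(base):
            # Model price of the bond maturing at T_i minus its market price.
            nodes = ho_lee_nodes(base, i - 1, sigma, dt)
            model = sum(g * math.exp(-r * dt) for g, r in zip(g_prev, nodes))
            return model - zero_prices[i]

        base = bisect(mismatch, -1.0, 1.0)   # solve for the lowest node r(0, i-1)
        nodes = ho_lee_nodes(base, i - 1, sigma, dt)
        rates.append(nodes)
        # Forward induction: each node feeds its two successors with weight 1/2.
        g_new = [0.0] * (i + 1)
        for s, (g, r) in enumerate(zip(g_prev, nodes)):
            half = 0.5 * g * math.exp(-r * dt)
            g_new[s] += half
            g_new[s + 1] += half
        ad.append(g_new)
    return rates, ad
```

Note how the nodes at slice $i-1$ are fixed by the bond maturing at $T_i$, which is the one-step lag mentioned in the text. Swapping in a BDT-type constraint would only require replacing `ho_lee_nodes`.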


16.3 Binomial Pricing of Forward Rate Agreements, Swaps, Caplets, Floorlets, Swaptions, and Other Derivatives


Recall from Chapter 2 the price of a plain-vanilla FRA of a given tenor $\tau = T_2 - T_1$.
Assuming continuous compounding, equation (2.8) can be used to give the net present value
of an FRA (to the party receiving an initial nominal amount) with one dollar nominal:

$$PV(\mathrm{FRA})_t = -Z_t(T_1) + e^{f_0(T_1,T_2)\tau}\, Z_t(T_2), \tag{16.17}$$

where the forward is given by

$$f_0(T_1,T_2) = \frac{1}{\tau}\log\left(\frac{Z_0(T_1)}{Z_0(T_2)}\right). \tag{16.18}$$


Since all expressions are completely determined by the prices of the zero-coupon bonds, it 
necessarily follows that all FRAs are also exactly priced by the binomial lattices obtained 
from the calibration procedure in the previous section. Moreover, as recalled from Chapter 2, 
a swap is just a collection of FRAs. Plain-vanilla swaps are, therefore, also priced exactly 
within the foregoing calibration framework. 
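As a small numerical illustration of equations (16.17) and (16.18), the following Python sketch computes the forward rate from two discount factors and the resulting FRA value. The function names and the sample discount factors are hypothetical; the sign convention simply follows equation (16.17).

```python
import math

def forward_rate(z1, z2, tau):
    """Continuously compounded forward rate over [T1, T2], equation (16.18)."""
    return math.log(z1 / z2) / tau

def fra_pv(z1, z2, fixed_rate, tau):
    """Value of the FRA per unit nominal, with the sign convention of (16.17)."""
    return -z1 + math.exp(fixed_rate * tau) * z2

# Hypothetical discount factors for T1 = 1.0 and T2 = 1.5 years.
z1, z2, tau = 0.95, 0.92, 0.5
f0 = forward_rate(z1, z2, tau)
pv_at_inception = fra_pv(z1, z2, f0, tau)
```

When the contracted rate equals the forward rate implied by the same curve, the value vanishes, so any lattice that reprices the zero-coupon bonds also reprices the FRA.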
The pricing of options such as caps (or caplets) is not as straightforward as that for
FRAs. In particular, recall from Chapter 2 the pay-off of a caplet struck at fixed interest rate
$R_K$, maturing at time $T$, on a floating reference rate $R^{(\tau)}(T)$ of tenor $\tau$ applied to the period
$[T, T+\tau]$ in the future. The floating rate is typically the three- or six-month LIBOR. The
pay-off of this caplet option is given by

$$\mathrm{Cpl}(R^{(\tau)}(T), T) = (R^{(\tau)}(T) - R_K)_+ \,\tau\, Z(r_T, T, T+\tau), \tag{16.19}$$

where $Z(r_T, T, T+\tau)$ is the discount function over that period, since the cash flow occurs at
time $T+\tau$. Here we define $(x)_+ = \max(x, 0)$ as usual. In order to obtain the price at current
time $t = 0$ of this caplet one must take an expectation, or integral, of the pay-off with a
risk-neutral distribution in the reference rate $R^{(\tau)}(t)$, where $t = T$, i.e., the expiry or maturity
time of the option on the (call-type) pay-off. The latter is, however, expressed in terms of a
rate applied to the period of the tenor (i.e., the reference forward rate) and not the short rate
used in the rate lattice calibration of the previous section. In particular, the short rate lattice
gives the conditional distribution of the short rate.

To price the caplet, one must relate the short rate to this reference forward rate. In
particular, the values of the short rate at the nodes $(j,i)$, $r(j,i)$, must be related to the values of
the reference forward rates, denoted by $R^{(\tau)}(j,i)$, at these nodes. This is achieved by using the
continuous-time relation for the forward rate, and this is where the conditional zero-coupon
prices $Z(r,t;T)$ are useful. In particular, for continuous compounding,

$$R^{(\tau)}(t) = \frac{1}{\tau}\log\left(\frac{Z(r,t,t)}{Z(r,t,t+\tau)}\right), \tag{16.20}$$

and since $Z(r,t,t) = 1$,

$$R^{(\tau)}(t) = \frac{1}{\tau}\log\left(\frac{1}{Z(r,t,t+\tau)}\right). \tag{16.21}$$

Choosing $t = T_i$ and $t + \tau = T_i + n\Delta t$, where it is assumed that the tenor is exactly $n$ periods of
the lattice time step, for some integer $n$, $\tau = n\Delta t$, we arrive at the discrete time value at the
$j$th node:

$$R^{(\tau)}(j,i) = \frac{1}{\tau}\log\left(\frac{1}{Z(j,T_i,T_{i+n})}\right). \tag{16.22}$$
Here $Z(j,T_i,T_i+n\Delta t) = Z(j,T_i,T_{i+n})$ is the zero-coupon value maturing at time $T_{i+n}$ ($n$
time steps in the future), conditional on the short rate's having value $r(j,i)$ at time $T_i$. Based
on equations (16.19) and (16.22), we can write all components of the payoff vector of the
caplet at each node $(j,i)$, denoted by $C^{(\tau)}(j,i)$, as

$$C^{(\tau)}(j,i) = (R^{(\tau)}(j,i) - R_K)_+ \,\tau\, Z(j,T_i,T_{i+n}), \tag{16.23}$$

where equation (16.22) is plugged in for $R^{(\tau)}(j,i)$. Note that the preceding equations assume
continuous compounding, while a similar set of equations obtains for the case of discrete
compounding, where the $\log(x)$ function is simply replaced by $x - 1$. The foregoing payoff vector
introduces an extra procedural step, requiring one to compute the quantities $Z(j,T_i,T_{i+n})$,
which involve a separate forward induction starting from the nodes $r(j,i)$. In practice, these
are computed using the discrete-time version of equation (16.7):

$$Z(j,T_i,T_{i+n}) = \sum_{k=j}^{j+n} G(j,T_i;k,T_{i+n}), \tag{16.24}$$
where the $(n+1)$ Arrow-Debreu values (conditional on beginning at a $j$th node at time $T_i$
and ending at node $k = j, j+1, \ldots, j+n$ at time $T_{i+n}$) on the right-hand side of this equation
are computed by forward recursion using an adaptation of equation (16.12), rewritten here in
a slightly different form:

$$G(j,T_i;k,T_{i+m}) = \frac{1}{2}\sum_{\substack{s=k,\,k-1 \\ j\le s\le j+m-1}} e^{-r(s,i+m-1)\Delta t}\, G(j,T_i;s,T_{i+m-1}). \tag{16.25}$$


Here $m = 1, 2, \ldots, n$ and the iteration is readily carried out from time $T = T_i$ to final time
$T_{i+n}$, where one initially has $G(j,T_i;j,T_i) = 1$ for any $j$ value. It is instructive to write out
the Arrow-Debreu values explicitly for the first two time steps. For a single step (for $m = 1$)
the terminal time is $T_{i+1}$ and we simply have

$$G(j,T_i;k,T_{i+1}) = \frac{1}{2}e^{-r(j,i)\Delta t}, \tag{16.26}$$


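where $k = j, j+1$ are the only two possible values for $k$. Not surprisingly, this is
consistent with the relation in equation (16.13). Propagating out to the second step ($m = 2$),
equation (16.25) gives

$$G(j,T_i;k,T_{i+2}) = \sum_{\substack{s=k,\,k-1 \\ j\le s\le j+1}} \frac{1}{2}e^{-r(j,i)\Delta t}\,\frac{1}{2}e^{-r(s,i+1)\Delta t}, \tag{16.27}$$

where possible values for $k$ are $j$, $j+1$, $j+2$. Summing up the terms explicitly, these three
Arrow-Debreu prices are

$$\begin{aligned}
G(j,T_i;j,T_{i+2}) &= \frac{1}{4}e^{-[r(j,i)+r(j,i+1)]\Delta t},\\
G(j,T_i;j+1,T_{i+2}) &= \frac{1}{4}e^{-r(j,i)\Delta t}\left[e^{-r(j,i+1)\Delta t} + e^{-r(j+1,i+1)\Delta t}\right],\\
G(j,T_i;j+2,T_{i+2}) &= \frac{1}{4}e^{-[r(j,i)+r(j+1,i+1)]\Delta t}.
\end{aligned} \tag{16.28}$$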


Specializing equation (16.14) we therefore finally have the binomial lattice pricing formula
for a caplet valued at current time $T_0 = 0$ and maturing at time $T_i$ of tenor $\tau = n\Delta t$:

$$\mathrm{Cpl}_0(R_K,T_i) = \sum_{j=0}^{i} G(0,0;j,T_i)\,C^{(\tau)}(j,i). \tag{16.29}$$

To summarize, the application of this pricing formula contains two components. The first
part is the computation of the $G(0,0;j,T_i)$, which are already available from the calibration
step, as discussed in the previous section. The second part consists of computing the payoff
components $C^{(\tau)}(j,i)$. These are obtained by first computing the conditional Arrow-Debreu
prices $G(j,T_i;k,T_{i+n})$ by forward induction using equation (16.25). These quantities are
then summed up to give the $Z(j,T_i,T_{i+n})$, as in equation (16.24). In turn, the latter are
plugged into equation (16.22), giving the forward rates $R^{(\tau)}(j,i)$, and hence $C^{(\tau)}(j,i)$, using
equation (16.23).
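The two components of the caplet pricing formula can be sketched as follows. This is a minimal Python illustration, not the spreadsheet implementation: it assumes the calibrated short rates are stored as a triangular list `rates[i][k]` $= r(k,i)$, and the function names are hypothetical. The conditional Arrow-Debreu prices follow the recursion (16.25), and the remaining steps follow equations (16.24), (16.22), (16.23), and (16.29) in that order.

```python
import math

def conditional_ad(rates, i, j, n, dt):
    """G(j,T_i; k,T_{i+n}) for k = j..j+n via the recursion (16.25).

    rates[t][k] is the short rate r(k, t); the returned list is indexed
    by the offset k - j.
    """
    g = [1.0]                                   # G(j,T_i; j,T_i) = 1
    for m in range(1, n + 1):
        g_next = [0.0] * (m + 1)
        for offset, value in enumerate(g):      # node s = j + offset at slice i+m-1
            half = 0.5 * value * math.exp(-rates[i + m - 1][j + offset] * dt)
            g_next[offset] += half              # terminal node k = s
            g_next[offset + 1] += half          # terminal node k = s + 1
        g = g_next
    return g

def caplet_price(rates, ad_at_expiry, i, n, dt, strike):
    """Caplet value at time 0 for expiry T_i and tenor n*dt, equation (16.29).

    ad_at_expiry[j] = G(0,0; j,T_i), as produced by the calibration step.
    """
    tau = n * dt
    price = 0.0
    for j, g0 in enumerate(ad_at_expiry):
        z = sum(conditional_ad(rates, i, j, n, dt))      # (16.24)
        fwd = math.log(1.0 / z) / tau                    # (16.22)
        payoff = max(fwd - strike, 0.0) * tau * z        # (16.23)
        price += g0 * payoff                             # (16.29)
    return price
```

Since the Arrow-Debreu prices from the root node are the special case $i = 0$, $j = 0$ of the same recursion, `conditional_ad(rates, 0, 0, i, dt)` can supply `ad_at_expiry` when the calibration output is not at hand. A floorlet follows by replacing the call pay-off with `max(strike - fwd, 0.0)`.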

Figure 16.4 depicts, schematically, this procedure for pricing a caplet. For implementation
considerations, note that the inputs within the ir sheet (for pricing a caplet) are the expiry
time $T_i$, which for simplicity is assumed to be an integer number of time steps from
current time $T_0$, and the tenor of the caplet, which is chosen as an integer number of time steps
past $T_i$, where the time step is the previously computed $\Delta t$. Note that, if needed, this apparent
restriction can be readily lifted by using a different time-step value for the lattice past the maturity
time $T_i$. The spreadsheet contains inputs for the number of time steps to reach the caplet
(or floorlet) option expiry time from today, i.e., an integer $M$ with $M\Delta t = T$, and another
integer for the number of steps defining the tenor.

FIGURE 16.4 Schematic representation of the separate components used for the pricing of a caplet
option of tenor $\tau$, expiring at time $T_i$. The initial leg starting from the current time node $r(0,0)$ gives the
Arrow-Debreu prices $G(0,0;j,T_i)$ at each $j$th node $r(j,i)$ at time $T_i$. The payoff vector of the caplet
with $j$th component $C^{(\tau)}(j,i)$ (for the $j$th node at time $T_i$) is obtained by summing all the Arrow-Debreu
prices $G(j,T_i;k,T_{i+n})$ ($k = j, \ldots, j+n$) that are conditional on starting at the node $r(j,i)$ at time $T_i$
and ending at nodes $r(k,i+n)$ at time $T_{i+n}$, for the period of the caplet.

The entire analysis for pricing a floorlet of the same maturity, struck at rate $R_K$, follows
almost identically as in the case of the caplet, except the pay-off is now that of a put,
$(R_K - R)_+$, instead of that of a call, $(R - R_K)_+$. Within this project one should allow for a
computation of both types of options as well as the pricing of swaptions.

Next, we consider the pricing of European swaptions. Such options, as discussed in
Chapter 2, come in two flavors: The payer swaption has pay-off given by equation (2.41),
while the receiver swaption has the put type of pay-off. Let us consider a payer swaption,
struck at rate $r_K$, on an underlying swap to start at time $T = T_{n_s}$ in the future and having a
lifetime of $n$ periods of fixed tenor $\tau$:

$$PSO_T = \tau\,(r_T^s - r_K)_+ \sum_{p=1}^{n} Z_T(T + p\tau). \tag{16.30}$$

Here $r_t^s$ denotes the equilibrium swap rate at time $t$. Hence, the first reset time of the swap is
assumed as $T = T_{n_s}$, with first payment time at $T + \tau$, the latter being the second reset time
with second payments occurring at $T + 2\tau$, etc. Note that, as in the case of caplets, within
the ir application spreadsheet the user enters both the option expiry time $T$ and the tenor $\tau$.
In addition, the swaption contract is defined by entering the number of periods $n$, with each
time period assumed constant and given by $\tau$. In particular, given a maturity $T$, we choose
a number of time steps $n_s$ up to maturity with $n_s\Delta t = T$, thereby defining a fixed time step
$\Delta t = T/n_s$. The contract is assumed to be specified as having tenor $\tau = m_s\Delta t$. The number
of time steps within the swap is then $N_s = m_s n$, giving a swap lifetime of $N_s\Delta t$; i.e., the
swap ends at calendar time given by the $(n_s + N_s)$th time slice: $T + n\tau = T_{n_s+N_s}$. Figure 16.5
shows a schematic of the time spacing for the swaption.

FIGURE 16.5 Time spacing for a swaption expiring at time $T$. The underlying swap has $n$ equal
periods of tenor $\tau$.

Now recall from Chapter 2 that the
equilibrium swap rate at time $T$ can be written as


$$r_T^s = \frac{Z_T(T) - Z_T(T+n\tau)}{\tau\sum_{p=1}^{n} Z_T(T+p\tau)} = \frac{1 - Z_T(T+n\tau)}{\tau\sum_{p=1}^{n} Z_T(T+p\tau)}. \tag{16.31}$$

The pay-off then takes the form

$$PSO_T = (A - B)_+. \tag{16.32}$$

Here $A$ is a floating-rate bond

$$A = 1 - Z_T(T+n\tau), \tag{16.33}$$

and

$$B = \tau r_K \sum_{p=1}^{n} Z_T(T+p\tau) \tag{16.34}$$

is an annuity or fixed-coupon bond originating at time $T$ with fixed payments of amount $\tau r_K$
at $n$ periods of time $\tau$. The formula in equation (16.32) is directly suitable for implementation.
Based on equations (16.33) and (16.34), the components of the payoff vector of the payer
swaption at each $j$th node $r(j, i = n_s)$, denoted by $P^{(\tau)}(j,i)$, are given by


$$P^{(\tau)}(j, i = n_s) = \left(1 - Z(j,T_{n_s},T_{n_s+N_s}) - \tau r_K \sum_{p=1}^{n} Z(j,T_{n_s},T_{n_s+pm_s})\right)_+. \tag{16.35}$$

This pay-off therefore requires the evaluation of the zero-coupons $Z(j,T_{n_s},T_{n_s+pm_s})$ conditional
on the starting node $r(j,n_s)$ at time slice $T_{n_s}$ and maturing at times $T_{n_s+pm_s}$, $p = 1, \ldots, n$.
These are computed in the same manner as described for the caplet case. Namely,
equation (16.24) gives

$$Z(j,T_{n_s},T_{n_s+pm_s}) = \sum_{k=j}^{j+pm_s} G(j,T_{n_s};k,T_{n_s+pm_s}), \tag{16.36}$$


where conditional Arrow-Debreu prices now need to be calculated at every time slice
$n_s + pm_s$, i.e., for $p = 1, \ldots, n$. The procedure for doing so is the same as in the caplet case,
where forward recursion equation (16.25) is used repeatedly. This time the recursion is carried
out for a total of $N_s = m_s n$ steps, and at each interval number $p$ of $m_s$ steps we extract a
$(pm_s + 1)$-dimensional vector of Arrow-Debreu prices $G(j,T_{n_s};k,T_{n_s+pm_s})$ with components
$k = j, j+1, \ldots, j+pm_s$. After having obtained the conditional zero coupons, the current
price of the payer swaption is given by discounting the payoff vector:

$$PSO_0(r_K,T) = \sum_{j=0}^{n_s} G(0,0;j,T)\,P^{(\tau)}(j,n_s). \tag{16.37}$$


Lastly, note that the pricing of receiver swaptions follows in identical manner, except that
the pay-off is simply replaced by the put type of expression $(B - A)_+$.
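The node-by-node swaption pay-off is easy to express once the conditional zero-coupon prices are available. The short Python sketch below assumes these prices have already been computed for one node and are passed in as a list `cond_zeros`, whose $p$th entry is the zero coupon maturing $p$ tenor periods after expiry; the function names and sample numbers are hypothetical. It also checks numerically that the form $(A - B)_+$ of equation (16.32) agrees with the swap-rate form of equation (16.30).

```python
def equilibrium_swap_rate(cond_zeros, tau):
    """Swap rate of equation (16.31) from zero coupons maturing at T + p*tau."""
    return (1.0 - cond_zeros[-1]) / (tau * sum(cond_zeros))

def payer_swaption_payoff(cond_zeros, tau, strike):
    """Pay-off (A - B)_+ of equations (16.32)-(16.34) at a single node."""
    a = 1.0 - cond_zeros[-1]                 # floating leg, (16.33)
    b = tau * strike * sum(cond_zeros)       # fixed-coupon annuity, (16.34)
    return max(a - b, 0.0)

def receiver_swaption_payoff(cond_zeros, tau, strike):
    """Put-type pay-off (B - A)_+ at a single node."""
    a = 1.0 - cond_zeros[-1]
    b = tau * strike * sum(cond_zeros)
    return max(b - a, 0.0)

# Hypothetical conditional zero-coupon prices for n = 4 semiannual periods.
zeros = [0.98, 0.955, 0.93, 0.90]
tau, strike = 0.5, 0.04
swap_rate = equilibrium_swap_rate(zeros, tau)
via_rate = tau * max(swap_rate - strike, 0.0) * sum(zeros)    # form (16.30)
via_bonds = payer_swaption_payoff(zeros, tau, strike)         # form (16.32)
```

Discounting these node pay-offs with the Arrow-Debreu prices at the expiry slice, as in equation (16.37), then gives the swaption price.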


16.4 Trinomial Lattice Calibration and Pricing in the Hull-White Model


The implementation of trinomial lattices for interest rate trees shares some similarities with the
case of stock price trees covered in the previous project on trinomial lattices for pricing equity
options. There are, however, some important differences, stemming from the fact that the
short rate is itself stochastic and, hence, discounting is inherently very different, as we have
seen in the binomial lattice implementation. Before proceeding to implement a specific short-rate
lattice, it is useful to note that there are various possible acceptable tree implementations.
Namely, one could adapt the tree methodologies used in the previous trinomial lattice project,
which deals with stock price processes, over into the case of a short-rate process. This requires
appropriate modifications to account for the mean-reversion effect as well as calibration to
discount bond prices across all time steps. The latter would require that the transition probabilities
($p_+$, $p_0$, and $p_-$) also depend on the nodal positions. One can, moreover, also incorporate
a similar drift parameter (i.e., the u parameter), which would now also depend on the $i$th
time slice $T_i$. Such a viable lattice makes use of only normal branching. Here we shall deviate
slightly and follow Hull and White's two-stage tree-building procedure [HW93, HW94,
Hul00]. As shown later, this procedure has the added advantage of separating out the reversion
term from the drift component. As well, the sampling of the short-rate nodes in the lattice is
done in a more efficient manner by incorporating three types of possible branching modes.


16.4.1 The First Stage: The Lattice with Zero Drift


As discussed in Chapter 2, the Hull-White (HW) model is defined by the stochastic short-rate
process, which can be written in the form

$$dr_t = [\theta(t) - a(t)r_t]\,dt + \sigma(t)\,dW_t, \tag{16.38}$$

where $\theta(t)$ is a time-dependent drift term. Throughout, we shall further restrict the mean
reversion $a(t) = a$ and volatility $\sigma(t) = \sigma$ to be time-independent parameters. For present
purposes this offers a reasonably good model that can be used to calibrate to zero-coupon
bonds and subsequently to price interest rate options. Extensions that allow for the reversion
speed and/or volatility functions to take on a time dependence (either numerically or analytically)
can also be readily achieved. This would allow for exact calibration of the lattice model
to a larger basket of instruments besides zero-coupon bonds. We leave this as an optional
implementation exercise for the interested reader. The first step is to construct a tree for the
related process with zero drift (and nonzero reversion) defined by

$$dr_t^* = -a r_t^*\,dt + \sigma\,dW_t. \tag{16.39}$$
Fixing $r_t^*$ within a time step, we compute the mean and variance of the random variable
$r^*_{t+\Delta t} - r^*_t$ as given by the expectations

$$E[r^*_{t+\Delta t} - r^*_t] = -a r^*_t\,\Delta t, \tag{16.40}$$

$$E[(r^*_{t+\Delta t} - r^*_t)^2] = a^2 (r^*_t)^2 (\Delta t)^2 + \sigma^2\Delta t, \tag{16.41}$$

where only terms up to order $(\Delta t)^2$ are included. The $r^*$ lattice has nodes defined by $r(j,i) =
r^*(j,i)$, where

$$r^*(j,i) = r_0^* + j\,\Delta r, \tag{16.42}$$

with $r_0^* = 0$ and $j = -i, -i+1, \ldots, i-1, i$ for any time slice $T_i = i\,\Delta t$ (see Figure 16.6).
Using equation (16.42) within equations (16.40) and (16.41) gives

$$E[r^*_{t+\Delta t} - r^*_t \mid r^*_t = j\,\Delta r] = -a j\,\Delta r\,\Delta t, \tag{16.43}$$

$$E[(r^*_{t+\Delta t} - r^*_t)^2 \mid r^*_t = j\,\Delta r] = a^2 j^2 (\Delta r)^2 (\Delta t)^2 + \sigma^2\Delta t. \tag{16.44}$$
At this point one finds explicit formulas for the transition probabilities $p_+$, $p_0$, and $p_-$ for,
respectively, the higher, middle, and lower branches emanating from a given node $r(j,i)$. The
three possible branching modes considered are depicted in Figure 16.7. Note the difference
in convention with respect to the indexing of the nodes that was used in the binomial lattice.
As in the previous project on trinomial lattice models, an up (down) move changes the $j$th
index in $r(j,i)$ by $+1$ ($-1$), while only for a middle move $j$ remains unchanged.

FIGURE 16.6 Schematic of the (driftless) symmetric trinomial $r^*$-lattice for the short-rate process with
symmetric (normal) branching from all nodes.

FIGURE 16.7 The three possible branching modes.

For the case of normal branching we compute the expectations

$$E[r^*_{t+\Delta t} - r^*_t \mid r^*_t = j\,\Delta r] = p_+(j+1)\Delta r + p_0\, j\,\Delta r + p_-(j-1)\Delta r - j\,\Delta r = (p_+ - p_-)\Delta r \tag{16.45}$$

and

$$E[(r^*_{t+\Delta t} - r^*_t)^2 \mid r^*_t = j\,\Delta r] = p_+(\Delta r)^2 + p_0\cdot 0 + p_-(\Delta r)^2 = (p_+ + p_-)(\Delta r)^2, \tag{16.46}$$


where we have used probability conservation $p_+ + p_0 + p_- = 1$. It has been observed in the
past [HW94] that numerical efficiency is maximized by fixing the spacing to

$$\Delta r = \sigma\sqrt{3\,\Delta t}. \tag{16.47}$$

Using this value for the spacing and equating expectations in equations (16.45) and (16.43)
and the expectation in equation (16.46) with that in equation (16.44) gives a linear system of
two equations in $p_+$ and $p_-$ with unique solution

$$p_\pm(j) = \frac{1}{6} + \frac{1}{2}\,a j\,\Delta t\,(a j\,\Delta t \mp 1). \tag{16.48}$$

Probability conservation gives

$$p_0(j) = \frac{2}{3} - (a j\,\Delta t)^2. \tag{16.49}$$


Note that the argument $j$ is used to explicitly denote the dependence of the transition
probabilities on the nodal $j$-position value.
A similar analysis gives the probabilities for downward branching:

$$p_+^d(j) = \frac{7}{6} + \frac{1}{2}\,a j\,\Delta t\,(a j\,\Delta t - 3), \tag{16.50}$$

$$p_-^d(j) = \frac{1}{6} + \frac{1}{2}\,a j\,\Delta t\,(a j\,\Delta t - 1), \tag{16.51}$$

$$p_0^d(j) = -\frac{1}{3} - a j\,\Delta t\,(a j\,\Delta t - 2). \tag{16.52}$$

The superscript $d$ is used to denote the transition probabilities for downward branching.




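Lastly, for upward branching we have

$$p_+^u(j) = \frac{1}{6} + \frac{1}{2}\,a j\,\Delta t\,(a j\,\Delta t + 1), \tag{16.53}$$

$$p_-^u(j) = \frac{7}{6} + \frac{1}{2}\,a j\,\Delta t\,(a j\,\Delta t + 3), \tag{16.54}$$

$$p_0^u(j) = -\frac{1}{3} - a j\,\Delta t\,(a j\,\Delta t + 2). \tag{16.55}$$

The superscript $u$ denotes the transition probabilities for upward branching.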

Note that the foregoing expressions are either concave or convex quadratic functions of $j$.
One can readily derive conditions on $j$ for the transition probabilities to be strictly positive.
Namely, for normal branching

$$-\frac{\sqrt{2/3}}{a\,\Delta t} < j < \frac{\sqrt{2/3}}{a\,\Delta t}, \tag{16.56}$$

for upward branching

$$\frac{-1-\sqrt{2/3}}{a\,\Delta t} < j < \frac{-1+\sqrt{2/3}}{a\,\Delta t}, \tag{16.57}$$

and for downward branching

$$\frac{1-\sqrt{2/3}}{a\,\Delta t} < j < \frac{1+\sqrt{2/3}}{a\,\Delta t}. \tag{16.58}$$

Throughout we assume $a > 0$. Let us define a maximum value $j_{\max}$ as the smallest integer
greater than $(1 - \sqrt{2/3})/(a\,\Delta t) \approx 0.1835/(a\,\Delta t)$, for the index $j$ at any time slice, and a
minimum value as $j_{\min} = -j_{\max}$. This leads to the branching methodology for each node $(j,i)$,
whereby normal branching is used for $j_{\min} < j < j_{\max}$, downward branching is used for extreme
positive value $j = j_{\max}$, and upward branching is used for extreme negative value $j = j_{\min}$.
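The branching rule and the three sets of transition probabilities translate directly into code. The following Python sketch is only an illustration; the function names `j_max` and `branch_probs` are hypothetical, and the probabilities are returned in the order $(p_+, p_0, p_-)$, i.e., highest, middle, and lowest branch.

```python
import math

def j_max(a, dt):
    """Smallest integer strictly greater than (1 - sqrt(2/3)) / (a dt)."""
    return math.floor((1.0 - math.sqrt(2.0 / 3.0)) / (a * dt)) + 1

def branch_probs(j, a, dt, jmax):
    """Return (mode, (p_plus, p_zero, p_minus)) for node index j."""
    x = a * j * dt
    if j >= jmax:       # downward branching, equations (16.50)-(16.52)
        return "down", (7.0 / 6.0 + 0.5 * x * (x - 3.0),
                        -1.0 / 3.0 - x * (x - 2.0),
                        1.0 / 6.0 + 0.5 * x * (x - 1.0))
    if j <= -jmax:      # upward branching, equations (16.53)-(16.55)
        return "up", (1.0 / 6.0 + 0.5 * x * (x + 1.0),
                      -1.0 / 3.0 - x * (x + 2.0),
                      7.0 / 6.0 + 0.5 * x * (x + 3.0))
    # normal branching, equations (16.48)-(16.49)
    return "normal", (1.0 / 6.0 + 0.5 * x * (x - 1.0),
                      2.0 / 3.0 - x * x,
                      1.0 / 6.0 + 0.5 * x * (x + 1.0))
```

For example, with $a = 0.1$ and $\Delta t = 1$ the bound is $0.1835/0.1 = 1.835$, so $j_{\max} = 2$ and the lattice never grows beyond five nodes per time slice.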


16.4.2 The Second Stage: Lattice Calibration with Drift and Reversion 


The purpose of the first stage is to build the component of the $r$-tree (i.e., the $r^*$-tree) that
encapsulates the mean-reversion and volatility aspect of the short-rate process. In the final
tree implementation, considered in this section, one needs to incorporate a drift component.
Namely, at each time slice the nodes will be drifted by an amount determined by the market
prices of the zero-coupon bonds. The drift component is incorporated by considering the
difference $\alpha_t = r_t - r_t^*$. This satisfies an ordinary differential equation

$$d\alpha_t = [\theta(t) - a\,\alpha_t]\,dt, \tag{16.59}$$

with solution

$$\alpha_t = e^{-at}\left[\alpha_0 + \int_0^t e^{as}\,\theta(s)\,ds\right]. \tag{16.60}$$

Here $\alpha_0 = r_0 = r(0,0)$ since $r_0^* = 0$. Equation (16.59) provides an apparently trivial analytical
link between the actual $r$-tree and the driftless $r^*$-tree since $\theta(t)$ can be obtained exactly
from the initial term structure [i.e., as function of the yield $y_0(t)$]. Indeed, the right-hand
side of equation (16.60) can be computed explicitly by applying the formulas derived in
Chapter 2. Namely, one can use equation (2.110) or (2.109) [note that there the drift function
$\theta(t)$ is called $a(t)$ and the function $b$ is called $a$ here] into the integral of equation (16.60) to
obtain $\alpha_t$. We will not adopt this methodology here since it leads to inaccurate results and,
moreover, bypasses the importance of the pricing algorithm for the drifted trinomial lattice,
which we now present.

To apply the trinomial lattice pricing methodology we simply extend the equations of
Section 16.2 into the trinomial lattice case. In general, we must distinguish between the
different possible branching modes. Let us first assume normal branching. In this case the Green's
function propagation (i.e., the Arrow-Debreu forward recursion) equation (16.12) is modified
to read

$$G(0,0;j,T_i) = \sum_{\substack{k=j\pm1,\,j \\ |k|\le i-1}} G(0,0;k,T_{i-1})\,G(k,T_{i-1};j,T_i), \tag{16.61}$$

where the Arrow-Debreu prices for a single time step are nonzero for $|k| \le i-1$ and $k =
j, j\pm1$, and given by

$$G(k,T_{i-1};j,T_i) = \begin{cases}
p_+(k)\,e^{-r(k,i-1)\Delta t}, & k = j-1,\\
p_0(k)\,e^{-r(k,i-1)\Delta t}, & k = j,\\
p_-(k)\,e^{-r(k,i-1)\Delta t}, & k = j+1.
\end{cases} \tag{16.62}$$

In contrast to the binomial case, the forward time propagation of Arrow-Debreu prices is
now obtained by summing contributions from up to three (as opposed to two) possible two-legged
paths. Note that the probabilities for up/down and middle moves in equations (16.61) and
(16.62) are the ones corresponding to normal branching. For terminal node values of $j$ close
to either $j_{\min}$ or $j_{\max}$, equations (16.61) and (16.62) need to be slightly modified. Namely,
for any given value of $j$, equation (16.61) must be modified to the more general case

$$G(0,0;j,T_i) = \sum_{k;\ |k|\le i-1} G(0,0;k,T_{i-1})\,p(j,k)\,e^{-r(k,i-1)\Delta t}. \tag{16.63}$$

This formula takes into account all (generally mixed) branching types. The quantities $p(j,k)$
denote the nodal transition probabilities for all possible nonzero contributions from intermediate
nodes at positions $k$ for time $T_{i-1}$. The corresponding probability values to
be used in the sum in equation (16.63) now depend on the terminal $j$ value. Assuming $j_{\max} > 2$, there
are possibly seven distinct cases to consider after $j_{\max}$ time steps.


1. $j = j_{\max}$ gives two terms (one down branch and one normal branch) with $p(j,j) = p_+^d(j)$,
$p(j,j-1) = p_+(j-1)$.

2. $j = j_{\max} - 1$ gives three terms (one down branch and two normal branches) with
$p(j,j+1) = p_0^d(j+1)$, $p(j,j) = p_0(j)$, $p(j,j-1) = p_+(j-1)$.

3. $j_{\min} + 2 < j < j_{\max} - 2$ gives three terms (three normal branches) with $p(j,j+1) =
p_-(j+1)$, $p(j,j) = p_0(j)$, $p(j,j-1) = p_+(j-1)$.

4. $j = j_{\min}$ gives two terms (one up branch and one normal branch) with $p(j,k=j+1) =
p_-(j+1)$, $p(j,j) = p_-^u(j)$.

5. $j = j_{\min} + 1$ gives three terms (one up branch and two normal branches) with $p(j,k =
j+1) = p_-(j+1)$, $p(j,j) = p_0(j)$, $p(j,j-1) = p_0^u(j-1)$.

6. $j = j_{\max} - 2$ gives four terms (one down branch and three normal branches) with $p(j,k =
j+2) = p_-^d(j+2)$, $p(j,j+1) = p_-(j+1)$, $p(j,j) = p_0(j)$, $p(j,j-1) = p_+(j-1)$.

7. $j = j_{\min} + 2$ gives four terms (one up branch and three normal branches) with $p(j,k =
j+1) = p_-(j+1)$, $p(j,j) = p_0(j)$, $p(j,j-1) = p_+(j-1)$, $p(j,j-2) = p_+^u(j-2)$.

The forward propagation of Arrow-Debreu prices therefore involves a sum of two, three,
or four terms in cases where the terminal node is close to $j_{\max}$ or $j_{\min}$. Most values of $j$,
however, involve normal branching, with the use of a three-term sum.

The pricing of zero-coupon bonds is essentially similar to the binomial lattice case, in
the sense that one iterates out to any given time slice $T_i$ to obtain the Arrow-Debreu prices
$G(0,0;j,T_i)$. The analogue of equation (16.14) takes the form

$$P(r,T_i) = \sum_{j=-i}^{i} G(0,0;j,T_i)\,A(r(j,i),T_i). \tag{16.64}$$

Specializing to the case of zero-coupon bonds, the equation analogous to equation (16.15)
for pricing zero-coupon bonds is

$$Z_0(T_i) = \sum_{j=-i}^{i} G(0,0;j,T_i). \tag{16.65}$$

Inserting equation (16.63) into equation (16.65) gives

$$Z_0(T_i) = \sum_{j=-i}^{i}\ \sum_{k;\ |k|\le i-1} G(0,0;k,T_{i-1})\,p(j,k)\,e^{-r(k,i-1)\Delta t}. \tag{16.66}$$

Hence, in general, one finds that the trinomial lattice calibration for a short-rate model can 
be achieved using a numerical root-finding procedure in equation (16.66) analogous to the 
binomial lattice methodology. The HW model, however, offers extra flexibility since one can 
actually solve the calibration problem analytically in the case of continuous compounding. 
The calibration of the lattice nodes for the HW model proceeds as follows. 

The preceding formulas are specialized to the case where the actual drifted lattice is
represented by

$$r(j,i) = \alpha(i) + j\,\Delta r, \tag{16.67}$$

with $\alpha(0) = r(0,0)$ as the initial node and the spacing given by equation (16.47). The
coefficients $\alpha(i)$ represent the central node $r(0,i)$ along each time slice $T_i$ and will therefore
account for the drift of the lattice. Plugging this into equation (16.66) and taking logarithms
we obtain the simple analytical form for the coefficients:

$$\alpha(i) = \frac{\log\left[\sum_{j}\sum_{k;\ |k|\le i} G(0,0;k,T_i)\,p(j,k)\,e^{-k\,\Delta r\,\Delta t}\right] - \log Z_0(T_{i+1})}{\Delta t}. \tag{16.68}$$

Note that we have shifted the time slice index $i$ to $i+1$. This gives the central node at each
time slice $T_i$, and hence from equation (16.67) all nodes $r(j,i)$ for time $T_i$ are obtained,
based on the market price of the zero-coupon bond maturing at time $T_{i+1}$ and knowledge of
the Arrow-Debreu prices out to time $T_i$. These Arrow-Debreu prices are in turn given by
forward induction using equation (16.63) by using the already-known values for the node
positions at time slice $i-1$.


One begins the calibration procedure with $G(0,0;0,0) = 1$, and the initial node $r(0,0) =
\alpha(0)$ is given in terms of the interpolated zero-coupon price at the first maturity time $T_1 = \Delta t$
using equation (16.16). Based on this and the zero-coupon prices at further maturities, one
obtains the rest of the lattice nodes using equations (16.68) and (16.63). For instance, after
the first step we have normal branching with $G(0,0;0,T_1) = p_0\,e^{-\alpha(0)\Delta t}$, $G(0,0;\pm1,T_1) =
p_\pm\,e^{-\alpha(0)\Delta t}$. Assuming normal branching, at the second time step we obtain $\alpha(1)$:

$$\begin{aligned}
\alpha(1) &= \frac{\log\left(\sum_{j=-1}^{1} G(0,0;j,T_1)\,e^{-j\,\Delta r\,\Delta t}\right) - \log Z_0(T_2)}{\Delta t}\\
&= \frac{\log\left(p_-\,e^{\Delta r\,\Delta t} + p_0 + p_+\,e^{-\Delta r\,\Delta t}\right) - \log Z_0(T_2)}{\Delta t} - \alpha(0).
\end{aligned} \tag{16.69}$$

This procedure is continued for the rest of the time steps, hence giving the calibrated lattice 
for as many time steps as needed. 

For the calibration of short-rate models that do not admit a simple analytical solution, 
such as the Black—Karasinski model covered in Section 16.5, one can readily proceed to find 
the central nodes numerically via a root-finding routine similar to what was described earlier 
for the binomial lattice. 
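The two-stage HW calibration can be condensed into a short forward-induction loop. The Python sketch below is a minimal illustration under the constant-$a$, constant-$\sigma$ assumption of this section; the function names and the dictionary layout (Arrow-Debreu prices keyed by the node index $j$) are our own choices, not part of the project spreadsheet. It uses the fact that the probabilities leaving any node $k$ sum to one, so the double sum in equation (16.68) collapses to a single sum over $k$.

```python
import math

def hw_branches(j, a, dt, jmax):
    """Terminal indices and probabilities (highest, middle, lowest) from node j."""
    x = a * j * dt
    if j >= jmax:       # downward branching, (16.50)-(16.52)
        return (j, j - 1, j - 2), (7 / 6 + 0.5 * x * (x - 3),
                                   -1 / 3 - x * (x - 2),
                                   1 / 6 + 0.5 * x * (x - 1))
    if j <= -jmax:      # upward branching, (16.53)-(16.55)
        return (j + 2, j + 1, j), (1 / 6 + 0.5 * x * (x + 1),
                                   -1 / 3 - x * (x + 2),
                                   7 / 6 + 0.5 * x * (x + 3))
    # normal branching, (16.48)-(16.49)
    return (j + 1, j, j - 1), (1 / 6 + 0.5 * x * (x - 1),
                               2 / 3 - x * x,
                               1 / 6 + 0.5 * x * (x + 1))

def calibrate_hw(zero_prices, a, sigma, dt):
    """zero_prices[i] = Z_0(T_i), i = 0..M, with zero_prices[0] = 1.

    Returns (alphas, ad_slices): alphas[i] is the central node alpha(i) and
    ad_slices[i] maps j to G(0,0; j,T_i).
    """
    dr = sigma * math.sqrt(3.0 * dt)                                  # (16.47)
    jmax = math.floor((1.0 - math.sqrt(2.0 / 3.0)) / (a * dt)) + 1
    ad = {0: 1.0}                                                     # G(0,0;0,0) = 1
    alphas, ad_slices = [], [ad]
    for i in range(len(zero_prices) - 1):
        # Analytic central node from the bond maturing at T_{i+1}, (16.68).
        s = sum(g * math.exp(-k * dr * dt) for k, g in ad.items())
        alpha = (math.log(s) - math.log(zero_prices[i + 1])) / dt
        alphas.append(alpha)
        # Forward induction with mixed branching, (16.63).
        nxt = {}
        for k, g in ad.items():
            discounted = g * math.exp(-(alpha + k * dr) * dt)
            targets, probs = hw_branches(k, a, dt, jmax)
            for j, p in zip(targets, probs):
                nxt[j] = nxt.get(j, 0.0) + discounted * p
        ad = nxt
        ad_slices.append(ad)
    return alphas, ad_slices
```

Propagating from each intermediate node $k$ to its (at most three) successors is equivalent to the seven-case enumeration given earlier, which is organized by terminal node $j$ instead; the code never has to distinguish those cases explicitly.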


16.4.3 Pricing Options


Once the calibrated lattice is built, the procedure for pricing options (e.g., caplets, floorlets,
swaptions) follows similar steps as described for the binomial lattices given in Section 16.3.
The conditional zero-coupon bonds are now obtained using

$$Z(j,T_i,T_{i+n}) = \sum_{k=j-n}^{j+n} G(j,T_i;k,T_{i+n}), \tag{16.70}$$

where the $(2n+1)$ Arrow-Debreu values (conditional on beginning at a $j$th node at time $T_i$
and ending at node $k = j-n, \ldots, j+n$ at time $T_{i+n}$) are computed by a general extension
of equation (16.63), i.e., using the forward recursion relation

$$G(j,T_i;k,T_{i+m}) = \sum_{s;\ |s|\le i+m-1} p(k,s)\,e^{-r(s,i+m-1)\Delta t}\,G(j,T_i;s,T_{i+m-1}). \tag{16.71}$$

Just as in equation (16.63), this forward propagation formula takes into account all possible
mixed branchings. Note that the starting node is denoted by index $j$, while the terminal node
now has index $k$. The nodal transition probabilities $p(k,s)$ are again given as described just
following equation (16.63). For instance, when $j_{\min} + 2 < k < j_{\max} - 2$, normal branching is
used with three possible nonzero values for $p(k,s)$: $p(k, s = k \mp 1) = p_\pm(k \mp 1)$, $p(k, s =
k) = p_0(k)$.

Based on knowledge of the conditional zero-coupon prices, all option-pricing formulas
are identical in form to those for the binomial lattice, except for the obvious modification in
having to compute and sum up more terms due to $n$ extra terminal nodes for every $n$ steps.
Hence, for example, the caplet price is obtained by modifying equation (16.29) slightly:

$$\mathrm{Cpl}_0(R_K,T_i) = \sum_{j=-i}^{i} G(0,0;j,T_i)\,C^{(\tau)}(j,i). \tag{16.72}$$
For evaluating swaptions, the conditional zero-coupon prices are now given by a formula
similar to equation (16.36) (except for the summation involving more nodes):

$$Z(j,T_{n_s},T_{n_s+pm_s}) = \sum_{k=j-pm_s}^{j+pm_s} G(j,T_{n_s};k,T_{n_s+pm_s}). \tag{16.73}$$

This leads to the pricing formula analogous to equation (16.37) (with payoff vector containing
$n_s$ more components) for the payer swaption:

$$PSO_0(r_K,T) = \sum_{j=-n_s}^{n_s} G(0,0;j,T)\,P^{(\tau)}(j,n_s), \tag{16.74}$$

where $P^{(\tau)}(j,n_s)$ is again given by equation (16.35). Similar pricing formulas follow in the
obvious manner for other instruments, such as floorlets and receiver swaptions. The interested
reader can also apply the methodology presented here to interest rate derivatives involving
more exotic payoff structures.


16.5 Calibration and Pricing within the Black-Karasinski Model

The Black-Karasinski (BK) model is described by the short-rate process

$$d\log r_t = [\theta(t) - a(t)\log r_t]\,dt + \sigma(t)\,dW_t, \tag{16.75}$$

where $a(t)$ is a time-dependent mean-reversion speed. This model is the lognormal version of
the Hull-White model, with $r_t$ replaced by $\log r_t$. Hence, a nice feature of this model is that
the occurrence of negative interest rates is not possible. Note also that the BK model is an
extension of the BDT model. Throughout we shall again assume a constant reversion speed
$a(t) = a$ and constant $\sigma(t) = \sigma$. In contrast to the BDT model, the BK model still incorporates
mean reversion under such conditions. As mentioned for the HW model, extensions to
time-dependent reversion and/or volatility can also be implemented with some modifications and
are left as an optional exercise.

The tree-building procedure for the BK model follows in similar fashion to the HW model
as described in Sections 16.4.1 and 16.4.2. The difference here is that the short-rate node
values are now replaced by their logarithms. Namely, the spacing takes a similar form as
equation (16.11) except that the nodes now also drift. In particular, we define a constant
spacing for the logarithm of the short-rate nodes: $\Delta x = \log r(j,i) - \log r(j-1,i)$ for any
time slice $T_i$, or equivalently

$$r(j,i) = r(j-1,i)\exp(\Delta x). \tag{16.76}$$

This leads to the geometry of the short-rate nodes defined by a modification of equation (16.67)
to read

$$r(j,i) = \alpha(i)\exp(j\,\Delta x), \qquad -i \le j \le i, \tag{16.77}$$

with $\alpha(i) = r(0,i)$ corresponding to the central node at time $T_i$ with $\alpha(0) = r(0,0)$. Although
we are somewhat at liberty to choose a spacing for $\Delta x$ in terms of $\Delta t$, we shall, in analogy
with the HW model, set the spacing as

$$\Delta x = \sigma\sqrt{3\,\Delta t}.$$
With this choice we can carry out algebra similar to that in the HW model (see Section 16.4.1).
Now, however, we consider the random variables $x_t = \log r_t$ (and equivalently $x_t^* = \log r_t^*$)
and compute conditional expectations $E[(x^*_{t+\Delta t} - x^*_t)^2 \mid x^*_t = j\,\Delta x]$ and $E[x^*_{t+\Delta t} - x^*_t \mid x^*_t = j\,\Delta x]$.
From equation (16.75) it is evident that $x_t$ obeys the HW model. As can easily be verified, the
end result is that one obtains exactly the same formulas for the $j$-nodal transition probabilities
$p_0$ and $p_\pm$ for middle and up/down moves, respectively, as in the HW case. Note, however,
that now the logarithmic spacing between the short-rate nodes is constant: $\Delta x = \Delta\log r =
\sigma\sqrt{3\,\Delta t}$.

The propagation of the Arrow-Debreu prices follows the general recursion procedure 
described earlier. Namely, the Arrow-Debreu prices originating from the present node $r(0,0)$ 
are given recursively by equation (16.63), where the node positions $r(k, i-1)$ are given by 
equation (16.77). The zero-coupon bonds are again given by equation (16.66). By plugging 
equation (16.63) into equation (16.66) and the expression for the nodes given by equation (16.77) 
one observes that, in contrast to the HW model, one cannot analytically solve 
for the central nodes. This is due to the fact that the grid spacing is constant in the logarithm 
of the short rate rather than the short rate itself. More explicitly, for the BK model we have 

$$Z(T_i) = \sum_{j=-i}^{i} \; \sum_{k;\,|k| \le i-1} G(0,0;k,T_{i-1})\; p(j,k)\; \exp\!\left[-a(i-1)\,\Delta t\, e^{k\,\Delta x}\right]. \tag{16.78}$$


Given the market zero-coupon price at maturity $T_i$ and the vector of Arrow-Debreu prices 
that are determined by forward recursion up to a previous time $T_{i-1}$, the parameter $a(i-1)$ in 
this last equation is determined numerically via a single-variable root-finding procedure. One 
can use the function zeroin of the MFZero library for this purpose. Hence, by determining 
the set of parameters $a(i)$, $i = 0, 1, \ldots, M$, one obtains the entire calibrated BK lattice out to 
time $T_M$. Option pricing within the BK model then follows the same trinomial methodology 
as in Section 16.4.3. 
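The calibration loop just described can be sketched in Python as follows. This is a minimal illustration under several assumptions, not the book's implementation: a plain bisection stands in for the zeroin routine of MFZero, only normal branching is used (so the step count must stay small enough for all probabilities to remain positive), the branching probabilities are the standard Hull-White ones, and the names `calibrate_bk`, `kappa`, and the flat 5% market curve are hypothetical.

```python
import math

def branch_probs(j, kappa, dt):
    """Standard Hull-White normal-branching (up, mid, down) probabilities."""
    m = kappa * j * dt
    return (1 / 6 + 0.5 * (m * m - m), 2 / 3 - m * m, 1 / 6 + 0.5 * (m * m + m))

def bisect(f, lo, hi, tol=1e-14, max_iter=200):
    """Plain bisection root finder; assumes f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def calibrate_bk(zero_prices, sigma, kappa, dt):
    """Fit central nodes a(0), ..., a(M-1) to market prices Z(T_1), ..., Z(T_M).

    Returns the list of a(i) and the Arrow-Debreu prices at T_M keyed by j.
    """
    dx = sigma * math.sqrt(3.0 * dt)
    G = {0: 1.0}  # Arrow-Debreu prices at T_0
    a = []
    for z_mkt in zero_prices:
        # Since the probabilities out of each node k sum to one, the model
        # zero price reduces to sum_k G_k * exp(-a * exp(k * dx) * dt).
        def mismatch(alpha, G=G, z_mkt=z_mkt):
            return sum(g * math.exp(-alpha * math.exp(k * dx) * dt)
                       for k, g in G.items()) - z_mkt
        alpha = bisect(mismatch, 1e-12, 10.0)
        a.append(alpha)
        # Forward recursion of Arrow-Debreu prices to the next time slice.
        G_next = {}
        for k, g in G.items():
            discounted = g * math.exp(-alpha * math.exp(k * dx) * dt)
            pu, pm, pd = branch_probs(k, kappa, dt)
            for j, p in ((k + 1, pu), (k, pm), (k - 1, pd)):
                G_next[j] = G_next.get(j, 0.0) + p * discounted
        G = G_next
    return a, G

# Hypothetical market data: flat 5% continuously compounded curve.
dt = 0.25
market = [math.exp(-0.05 * i * dt) for i in range(1, 5)]
a, G = calibrate_bk(market, sigma=0.2, kappa=0.1, dt=dt)
print(a)
print(sum(G.values()), market[-1])
```

The sum of the final Arrow-Debreu prices reproduces the last market zero-coupon price, which is a convenient consistency check on the forward recursion.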
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forward, 92, 189, 210-211, 215, 224 


L 


Lagrange adjoint, 189-190 
Laplace transforms, 190-191 
inverse, for Bessel process/Green’s 
function, 200, 201f, 204f, 207, 209 
inverse, for Green’s function and 
diffusion kernels, 190-191, 
194-196, 198 
Lattice (tree) methods. See also Binomial 
lattice model; Interest rate trees; 
Trinomial lattice model 
American options, 98—100, 340, 345, 353 
calibration procedure of, 99—100 
Crank-Nicolson option pricer, 349-350 
European options, 334-345, 337-340 
volatility, 338-339, 343-344, 350, 351 


LIBOR, 144 
LIBOR curve, 119 
Likelihood ratio, 261 
Linear approximation, value-at-risk, 
sensitivity analysis and, 289, 290f, 291f 
Linear ordinary differential equations, 192 
Linear volatility models 
double knockout options, 175-176, 177f 
pricing kernels, barrier options and, 
172-178, 179f 
Local volatility. See State-dependent 
volatility 
Logarithmic pay-off, 363-364 
static hedging, 364-367, 365f, 366f 
Lognormal distribution, 40, 140, 338-339 
Lognormal model, 243-245, 244f. 
See also Black-Scholes formulas 
Log-returns, quantile-quantile plot, 
standardization and, 327-329, 329f 
Long position, VaR for, 283-284, 284f 
Lower-wall options, 152 


M 


Macdonald functions. See Bessel function 
Market completeness, 7—8 
Market risk, 240 
Market strikes, 58 
Market-risk model, 308 
Markov chain, 25, 97 
Martingales, 10, 26-31, 36, 82, 121 
continuous square integrable, 28—30 
definition of, 26-27 
jump process and, 28 
MC. See Monte Carlo methods 
Mean square error 
risk-factor aggregation, dimension 
reduction and, 294-295 
small, dimension reduction, 295—298 
Measure, change of, 14 
Measure theory, 12-13 
Mellin integral, 190 
MFLapack, 365, 370 
Mgf. See Moment-generating function 
Moment methods, 21—22, 264-266 
Cornish-Fisher, 265 
Johnson, 265—266 
Moment-generating function (mgf) 
Fourier transform of, 267—268 
Money-market account, 10, 59-60 
Monotonically decreasing function, 39, 218, 
219, 228, 323f 
Monotonically increasing map, 179, 
228, 324 








Monte Carlo (MC) methods, 43, 268-287 
calibration, 332-333, 333f 
chooser option, 334-335, 335f 
ΔΓ portfolios, variance reduction and, 
261-264 
multivariate normal distribution, 369-371 
multivariate student t distribution, 
371-373, 372f 
perturbation theory, VaR and, 311-312 
pricing equity basket options, 
333-335, 335f 
scenario generation, 331-332 
value-at-risk for delta-gamma portfolios, 
369-373, 372f 
Multifactor models, 141—146 
Brace-Gatarek-Musiela-Jamshidian with 
no-arbitrage constraints, 144-146 
Heath-Jarrow-Morton with no-arbitrage 
constraints, 141, 142-143 
Multivariate continuous distributions, 16—23 
bivariate distribution, 20 
characteristic function, 21—22 
cumulative distribution function, 17—18 
moments and, 21—22 
probability densities and, 18-19 
Multivariate normal distribution, 369-371 
Multivariate risk factor models, 249, 250f 
Multivariate student t distribution, 
371-373, 372f 


N 


N-dimensional case, arbitrage detection, 

formulation of arbitrage portfolios 

in, 319-321 

No-arbitrage constraints, 148 
Brace-Gatarek-Musiela-Jamshidian 

with, 144-146 

Heath-Jarrow-Morton with, 141, 142-143 

Nonanticipative function, 26, 30 

Nonlinear Volterra integral equations, 110 

Nonnegative portfolio, 15 

Nonparametric density estimator, 243, 

247-249 

Normal distribution. See Gaussian 

distribution 

Null hypothesis, 258-259 

Numeraire asset, 5, 46—47, 66 








O 


ODE. See Ordinary differential equation 

Optimal stopping time formulation, 
arbitrage-free pricing and American 
options, 93-103 
Options. See American options; 
At-the-money option; Basket option; 
Bermudan option; Bond options; 
Butterfly spread option; Call options; 
Chooser option; Compound options; 
Currency option; Elf-X option; 
European call option; European-style 
futures options; Exotic options; 
In-the-money option; Knockin option; 
Out-of-the-money option; Pay-at-hit 
one options; Perpetual double barrier 
option; Plain-vanilla option; Put 
options; Quanto option; 

Stock options 

Ordinary differential equation (ODE), 103 

Ornstein-Uhlenbeck process, 32, 129 

Out-of-the-money option, 323, 323f 


P 


Parameter estimation and factorization, 257 
Partial differential equation (PDE), 36, 37. 
See also Kolmogorov equation 
backward, 155 
Black-Scholes equation, 37, 49, 88—89, 
90-91, 102-103, 107-108, 162, 325, 
343-344, 349 
Derman-Kani, 91—93 
dual Black-Scholes equation, 90-91 
Fokker-Planck equation, 89-90, 92 
integrated equation formulation and, 
106-112 
for pricing functions and kernels, 
88-93 
Partition of D, 15 
Parzen estimator, 248f, 311 
Parzen model, 247, 248f, 249, 250f, 281f, 
283, 285, 285f, 294 
Path-integral, 141 
Pay-at-hit one options, 152, 170 
Pay-off (Payout), 3, 316-317 
discounted expectation of future, 5, 9, 
120-121 
elementary, 152 
exponential, 56-57, 57f 
logarithmic, 363—364 
nonnegative, 94 
replication of exotic, 52-59 
sinusoidal, 57, 57f 
stream, 5 
Pay-off function, 3-4, 5 
discounted, 262-263 
PDE. See Partial differential equation 
Perfectly liquid, 4 
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Perfect-markets hypothesis, 5 
Perpetual double barrier option 
continuous-time financial models, 64 
risk-neutral measures, derivative asset 
pricing, 71 
Perturbation theory, 306-312 
error bounds, condition number and, 
309-310 
first-order perturbation property proof, 
308-309 
mixture model, 311-312, 311f 
of return model, 308-312, 311f 
value-at-risk well posed, 306-308 
Plain-vanilla option, 173 
Plain-vanilla structures, 116 
Plain-vanilla swaps, 117-118 
Portfolio, 316 
arbitrage, 317-318 
change in, 240 
statistical estimations for ΔΓ, 255-261 
Portfolio composition, 286-287 
Portfolio immunization, 4 
Portfolio models 
value-at-risk, 251—254, 255f, 286-287 
Portfolio-dependent estimation 
portfolio decomposition and, 256-257 
Positive definite, matrix, 18—19 
Price, 4 
Price vector, 315 
Pricing measure, 9 
Black-Scholes formulas, 120—127 
bond future options, 126 
bond options, 124 
caplets, 123 
continuous time, 67 
futures-forward price spread, 124-125 
single-period continuous case, 15-16 
stock options with stochastic interest 
rates, 121—122 
swaptions, 122 
Pricing theory 
American options, 93—112 
analytical pricing formulas, 210-232 
arbitrage-free pricing, optimal stopping 
time formulation and American 
options, 93—103 
Black-Scholes type formulas, 77—87 
Brownian motion, martingales, stochastic 
integrals and, 23-32 
continuous state spaces in, 12—16 
continuous-time financial models, 59-65 
dynamic hedging and derivative asset 
pricing in continuous time, 65-71 
early-exercise boundary properties, 
105-106, 107f 


financial assets classes for, 3 
forwards and European calls and puts, 
46-52 
geometric Brownian motion, 37-46 
hedging with forwards and futures, 
71-77 
introduction to, 3—6 
multivariate continuous distributions, 
16-23 
partial differential equations and 
integrated equation formulation, 
106-112 
partial differential equations for pricing 
functions and kernels, 88—93 
perpetual American options, 103—105 
single-period finite financial models in, 
6-12 
static hedging, replication of exotic 
pay-offs and, 52-59 
stochastic differential equations and Itô’s 
formula, 32-37 
Probability 
historical, statistical or real-world, 6 
implied, 6 
risk-neutral (risk-adjusted), 9 
Probability conservation, 226-229, 391 
Probability densities, 89, 308 
fast convolution method and, 270, 
271f, 275 
multivariate continuous distributions and, 
18-19 
of quadratic random variable, 270 
Probability distribution function (pdf) 
continuous, 13 
joint, 16 
Probability mass function, 13-14 
Probability measures m, 12-13 
Probability space, 6, 13 
Probability theory, for random 
variables, 12-13 
Problem well posed, Hadamard, 307—308 
Pull to par effect, 124 
Pure discount bond. See Zero-coupon bond 
Put option, 222 
American, 48, 96, 107-109 
struck at K and of maturity T, 48 
Put-call parity, 48-51, 82, 322 
in trinomial lattice model, 347—348 
Put-call reversal symmetry, 83 


Q 


QR factorization, 260-261, 299-300 
Quadratic random variable, 270 


Quadratic variation, 28—29 
Quadratic volatility models 
double knockout options, 178f, 
185-187, 188f 
with one double root, 224 
pricing kernels, barrier options and, 
178-189 
x-space, F-space process, Bessel families, 
224-226, 225f 
Quantile-quantile plot, 244, 244f, 246f, 
248f, 327-329, 328f, 329f 
log-returns, standardization and, 
327-329, 329f 
Quanto option 
Black-Scholes model, 79-80 


R 


Radon-Nikodym derivative, 14, 70, 262 
Random variable, 6 
Random-walk model, 242 
with asymmetric t model, 247f 
multivariate, 249 
normal, 244f 
with Parzen density estimate, 248f 
Real estate, 3 
Real-world returns, 241 
Reduction-mapping, for pricing kernels, 
214-215, 233-235 
Redundancies, 4 
Relative asset price process, 10 
Relative returns, 242, 243f 
Relative value-at-risk, 300-301 
Return model, perturbation theory of, 
308-312, 311f 
Returns, 316-317 
Rho (ρ), 51 
Richardson’s extrapolation, 278—280, 280f 
Riemann integrable, 273-275 
Risk, causes of, 240 
Risk factor, 4, 240 
Risk factor models, 243—250, 255 
asymmetric student’s t, 245-246, 247f, 
281f, 283, 285, 285f 
lognormal, 243-245, 244f, 281f, 
285, 285f 
multivariate, 249, 250f 
Parzen, 247, 248f, 249, 250f, 281f, 283, 
285, 294 
Risk free, 4, 8 
Risk-factor aggregation 
95% VaR surfaces, 302f 
99% VaR surfaces, 302f, 303f, 304, 304f 
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algorithm, dimension reduction method 
1, 297-298 
algorithm, dimension reduction method 2, 
299-300 
comparison of method 1 to 2 in, 301-303, 
302f, 303f 
dimension reduction and, 294—303 
dimension reduction, optimization and, 
303-306, 304f, 305t—306t 
Lemma, 296-297 
method 1, reduction with small mean 
square error, 295-298, 301-303, 
302f, 303f 
method 2, reduction by low-rank 
approximation, 298-300, 301-303, 
302f, 303f 
Risk-neutral conditional probability density, 
40-41 
Risk-neutral measure, 10, 46, 70, 121, 136f, 
141, 143, 166 
Risk-neutral pricing, 323-324, 324f 
Risk-neutral (risk-adjusted) probability, 9, 
345, 347-348 
single-period asset pricing and, 318-319 
R-tree, 392 
Ruling out jumps, 120 
Runge-Kutta method, 110 


S 


Scenario 
in single-period models, 6 
weighted, 261 
Schmidt-Mirsky theorem, 299 
Schur decomposition, 299-300 
SDE. See Stochastic differential equation 
Self-financing replication strategy, 62—63 
Self-financing trading strategy, 62 
Sensitivity analysis 
trinomial lattice model, 348 
value-at-risk, linear approximation and, 
289, 290f, 291f 
Short position, VaR for, 282, 282f, 
283f, 311 
Short rate, one-factor models for 
Black-Karasinski, 396 
bond-pricing equilibrium, 127-129 
Cox-Ingersoll-Ross, 129, 134-138 
Flesaker-Hughston, 139-140 
Hull-White, Ho-Lee and Vasicek, 
129-134, 389-390, 390f, 392 
Single-period asset pricing, 318-319 
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Single-period finite financial models, 6-12 
arbitrage, 7-8, 9 
asset pricing fundamental theorem 
(Lemma), 10-12 
financial model, 7 
portfolio and asset, 7 
pricing measure, 10 
scenario/probability space in, 6 
Singular value decomposition (SVD), 376 
Sinusoidal pay-off, 57, 57f 
Smooth pasting condition, 100-103, 101f 
SPD. See Symmetric positive-definite (SPD) 
matrix 
Spectral shift, covariance matrix and, 
376-377, 377 
S-plane, 200 
Standard deviation, 19, 328 
State-dependent asset price, 149 
State-dependent diffusion problem. 
See also F-space process 
State-dependent volatility, 88, 149 
Bessel families of, 215-222, 225, 225f 
States of the world, 316 
Stochastic continuity, 28—29 
Stochastic differential equation (SDE), 
30, 101 
European call option, 40, 43 
Feynman-Kac theorem, 36-37 
geometric Brownian motion, 37-44 
Itô’s formula (Lemma) and, 32-37, 
142, 364 
nonlinear transformations, 33—34 
stock price process, 42 
Stochastic integrals (Itô), 24-25, 27n5, 29-31 
Stochastic interest rates, 121—122 
Stochastic process, 5, 6 
adapted process of, 60 
Stochastic volatility, VaR, 292, 293f, 294 
Stock options, 121-122 
Stock price process, SDE, 42 
Stocks, 3, 311 
chooser basket options on two, 43—45 
Stopping domain, 94, 95 
Stopping time, 61 
Straddles, 54 
Strong law of large numbers, 29 
Sturm-Liouville theory, 151, 200 
eigenvalues, 203, 206 
singular, 198 
standard, 192, 197-198 
SVD. See Singular value decomposition 
Swaps 
plain-vanilla, 117-118 
receiver’s interest rate, 118, 119f 
variance, 75—76 


Swaptions, payer, 122, 140, 145-146 
interest rate trees, 387—389, 396 
Symmetric positive-definite (SPD) 
matrix, 375 


T 


Taylor (A) approximations, 252 
Taylor expansion, 33-34, 51, 196, 370 
Theta (Θ), 51 
Toronto Stock Exchange (TSE), 292 
Trading strategy, 3—4 
Transaction costs, 4 
Transformation reduction methodology, 
210-215 
diffusion canonical transformation, 210, 
211-212, 215 
Fokker-Planck equation (Lemma), 
210-211, 232-233 
invertible variable transformation, 
213-214 
reduction-mapping for pricing kernels, 
214-215, 233-235 
Transition density, 84, 262 
eigenfunction expansions for, 197—199 
Transition probability density, 25, 220f 
Transpose, 240 
Trinomial lattice model 
building, 341-344 
calibration, 346 
computing sensitivities, 348 
drifted, 343-344, 352, 352f 
Hull-White model pricing and calibration 
of, 389-396, 390f 
nondrifted, 342-343, 342f, 346, 347f, 
352, 390f 
pricing barrier options, 346-347, 346f 
pricing procedure, 344-345, 345f 
put-call parity in, 347-348 
Trinomial lattice model, Hull-White 
model and 
downward branching model of, 
390-392, 391f 
first stage: lattice with zero drift, 
389-392, 390f 
normal branching of, 389, 390-392, 391f 
pricing options, 395-396 
second stage: lattice calibrations with drift 
and reversion, 392-395 
upward branching model of, 
390-392, 391f 
Truncation error, 254 
TSE35, 292, 301, 303 


U 


Underlyings, 3 

Up-and-out call, 164-165, 173, 184-185, 
352, 358-360 

Up-and-out put, 164, 173, 358-360 

Upper-wall options, 152 

U.S. Treasury, 147f 


V 


Value process, 63 
Value-at-risk (VaR), 239, 240f 
95%, surfaces, 302f, 311f 
99%, 302f, 303f, 304, 304f, 311f 
absolute vs. relative, 300-301 
algorithm, 257, 279, 288 
Black-Scholes model, 251, 253-254, 255f 
computing gradient of, 285-286 
covariance estimation and scenario 
generation in, 375-377, 377f 
examples, 281-294 
fast convolution method, 268—280, 281f 
fat tails, 281-284, 282f-284f 
gradient and portfolio composition, 
286-287 
gradient computation and, 287—289 
hedging with, 291-292, 293f 
for long position, 283—284, 284f 
Monte Carlo (MC) methods, delta-gamma 
portfolios and, 369-373, 372f 
numerical methods for ΔΓ portfolios, 
261-268, 288 
perturbation theory, 306-312 
portfolio models, 251-254, 255f 
risk-factor aggregation, dimension 
reduction and, 294-303 
risk-factor models, 243-250 
sensitivity analysis, linear approximation 
and, 289, 290f, 291f 
for short position, 282, 282f, 283f 
simple formula of, 240-241 
simulation results, 284—285, 285f 
statistical estimations for ΔΓ portfolio 
models, 251-254, 255f 
stochastic volatility, 292, 293f, 294 
well posed, 306-318 
Variance reduction, 261—264 
Variance swaps, 75-76 
logarithmic pay-off, 363-364 
static hedging, replication of logarithmic 
pay-off, 364-367, 365f, 366f 
Variation, 28—29 
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Varswaps, 364 
Vasicek models, 129-134 
Vega (A), 50 
Visual Basic, 321 
Volatility 
Black-Scholes, 74, 225f, 343 
in-the-money vs. out-of-the-money option, 
323, 323f 
lattice, 338-339, 343-344, 350, 351 
stochastic, 292, 293f, 294 
Volatility models 
barrier pricing formulas for 
multiparameter, 229-232 
linear, 172-178, 179f 
quadratic, 178-189, 179f 
state-dependent, 88, 149, 215-222, 
225, 225f 


W 


Weight function, 261 
Weighted scenario, 261 
Wiener process, 35, 41, 139, 150, 173, 189. 
See also Brownian motion 
CEV model, 224 
Green’s function, 194—197 
quadratic model, 225f 
single-barrier kernels for simplest models, 
Brownian motion with drift and, 
158-160 
single-barrier kernels for simplest models, 
driftless case and, 152-158 
transformation reduction methodology, 210 
zero-drift, 179-180 
Wingspreads, 54 
Wronskian, 193, 194 
Bessel function and, 200, 202, 206, 208, 
216, 218, 236 


X 


X-space process, 150, 179, 185, 194, 210 

absorption or probability conservation 
conditions, 226—229 

barrier pricing formulas for 
multiparameter volatility models, 
229-232 

Bessel families of state-dependent 
volatility models, 215-222 

constant-elasticity-of-variance model, 
222-224 

generating function, 194, 215 


quadratic models, 224-226, 225f 
reduction-mapping for pricing kernels, 
214-215, 233-235 
transformation reduction methodology, 
210-215, 232-233 


Y 


Yield curve, 120 


Z 


Zero boundary condition, 156, 161, 173, 
180, 185, 193, 197, 199, 202, 206 
Zero drift, lattice with, 389-392, 390f 
Zero-coupon bond, 8, 46-47, 55, 72, 106, 
113, 114f, 141, 384f 
bond-forward contract, 114—115, 115f 
interest tree rates, 384f, 385, 388, 392, 
394-397 
Zero-time-decay condition, 100, 102 


